How do I create a mapping in XESame from .csv to XES?

JBuijs · October 2010

This thread continues from
http://prom.win.tue.nl/forum/index.php?p=/discussion/comment/71/#Comment_71
where the following question was asked:

---------------------------------------
That certainly helped me.

One more query regarding generating an .xes / .mxml file from my own data source.

I have a .csv file. How do I generate the ".mappings" for this file, so that I can use it to generate an .xes file from XESame?

I went through your thesis report, but I didn't find the explicit steps for generating mappings.

---
Thanks.
Narayan.

-------------------------------------

Answer:
A definition can be made in the definition workspace (middle tab on the top). Here you see a tree structure with different elements of the mapping.
How to define a mapping using this tree structure and what to enter in each field can be read in my Master's thesis, mainly in Chapter 5, Section 5.3.2. More complex examples are discussed in Chapter 6 which will also help you to understand the definition of the mapping.
See http://prom.win.tue.nl/research/wiki/_media/xesame/xesma_thesis_final.pdf

Good luck Narayan!

Joos

lmotamar · October 2010

Dear Joos,

I am working on XESame with my own data file called IGSTK.csv

2010-10-18 14:22:47: (PROGRESS) Starting step 1 of 5: initialization.
2010-10-18 14:22:50: (PROGRESS) Starting step 2 of 5: Extracting log info.
2010-10-18 14:22:50: (DEBUG) We are about to run the following query to extract the items for Log:
SELECT 'No attributes' AS DUMMY FROM
2010-10-18 14:22:50: (ERROR) There is an error in the query (turn on 'debug' mode to see it) to fetch the information for Log, this is the error we got: (-3506) [Microsoft][ODBC Text Driver] Syntax error in FROM clause.
2010-10-18 14:22:51: (PROGRESS) We stopped execution because of a critical error.

Please advise me on how to correct the above error.

The IGSTK.csv, the IGSTK.mapping (which I configured and saved) and a few screenshots of my execution are uploaded at my
website (http://www.public.asu.edu/~lmotamar/XESame/).

Please have a look at the attributes and properties of the mapping and correct me if anything is wrong.

Thanks
---
Narayan.

JBuijs · October 2010

Dear Narayan,

Unfortunately, XESame is not yet smart enough to detect that you didn't provide any attribute values for the log element. Therefore, it still builds a query and tries to run it. Since you didn't specify the 'from' property of the log element, the query ends in an empty FROM clause and executing it gives an error.
Solution: enter 'IGSTK.csv AS igstk' in the 'from' property of the log element (as you did for the trace element).
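
To illustrate why the empty 'from' property breaks the query, here is a minimal sketch in Python. It is not XESame's actual code: the function name build_select and the way the query is assembled are assumptions inferred only from the console output in this thread.

```python
def build_select(attributes, from_clause):
    """Assemble a SELECT statement from attribute expressions and a 'from' value.

    Hypothetical helper: it only mimics the query shape seen in the console log.
    """
    if attributes:
        columns = ", ".join(
            f"{expression} AS [{name}]" for name, expression in attributes.items()
        )
    else:
        # Placeholder column used when no attributes are defined.
        columns = "'No attributes' AS DUMMY"
    return f"SELECT {columns} FROM {from_clause}".strip()


# Empty 'from' property: the statement ends right after FROM, which is a syntax error.
print(build_select({}, ""))

# With the 'from' property filled in, the statement is complete.
print(build_select({}, "IGSTK.csv AS igstk"))
```

The first statement printed has nothing after FROM, which is what the ODBC Text Driver rejects. The second is complete once the 'from' property is set.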

Another question: is it correct that the event definition is also empty?

Let me know how it works out.

Joos

lmotamar · October 2010

Dear Joos,

This time, I made the changes you mentioned. I added the log, trace and event definition info.

I got this error.
---

2010-10-21 00:29:05: (DEBUG) Derby system dir before reboot: null
2010-10-21 00:29:10: (DEBUG) Derby system dir after reboot: null
2010-10-21 00:29:15: (PROGRESS) Starting step 1 of 5: initialization.
2010-10-21 00:29:18: (PROGRESS) Starting step 2 of 5: Extracting log info.
2010-10-21 00:29:18: (DEBUG) We are about to run the following query to extract the items for Log:
SELECT 'igstk log' AS [concept_name], 'standard' AS [lifecycle_model] FROM IGSTK.csv AS igstk
2010-10-21 00:29:19: (DEBUG) We will insert a new attribute (concept:name with value igstk log) into the cache DB for the Log item with ID 0
2010-10-21 00:29:19: (DEBUG) We will insert a new attribute (lifecycle:model with value standard) into the cache DB for the Log item with ID 0
2010-10-21 00:29:19: (PROGRESS) Starting step 3 of 5: Extracting traces.
2010-10-21 00:29:19: (DEBUG) We are about to run the following query to extract the items for Trace:
SELECT DISTINCT AS [traceID], 'igstk: ' & igstk.id AS [concept_name], '' AS [IGSTK_details] FROM IGSTK.csv AS igstk
2010-10-21 00:29:19: (ERROR) There is an error in the query (turn on 'debug' mode to see it) to fetch the information for Trace, this is the error we got: (-3504) [Microsoft][ODBC Text Driver] The SELECT statement includes a reserved word or an argument name that is misspelled or missing, or the punctuation is incorrect.
2010-10-21 00:29:21: (PROGRESS) We stopped execution because of a critical error.

---

Well, I made the mapping definition based on your example.
My data (IGSTK.csv) is a single file, unlike yours, so I just created a single event and tried to define the entire mapping very similarly to yours.

Can you find any corrections to be made?

Again, I am placing the updated "igstk_latest.mapping" file on my website. Please have a look: http://www.public.asu.edu/~lmotamar/XESame/

---
Narayan.

JBuijs · October 2010

Dear Narayan,

As far as I can see you didn't specify a 'traceID' for the trace element.
The traceID is used to link events to their traces (/cases).
So for each trace you need to specify what the unique identification is for a trace (most likely the primary key of the trace table).
For each event definition you need to specify where the traceID is stored to relate it back to the traces.
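
As a rough illustration of how the traceID links events back to their traces, consider the following Python sketch. The function attach_events and the sample values are hypothetical. XESame does this internally through its cache database, so this only shows the idea.

```python
def attach_events(traces, events):
    """Group event rows under the trace whose traceID they carry."""
    by_trace = {
        trace["traceID"]: {"attributes": trace, "events": []} for trace in traces
    }
    orphans = []
    for event in events:
        target = by_trace.get(event["traceID"])
        if target is None:
            # No trace has this traceID, so the event cannot be placed anywhere.
            orphans.append(event)
        else:
            target["events"].append(event)
    return by_trace, orphans


# Hypothetical sample rows; the column names follow the mapping discussed here.
traces = [{"traceID": "10842", "concept_name": "igstk: 10842"}]
events = [
    {"traceID": "10842", "concept_name": "bug create"},
    {"traceID": "99999", "concept_name": "bug create"},
]

linked, orphans = attach_events(traces, events)
print(len(linked["10842"]["events"]), len(orphans))
```

Without a traceID on both the trace and the event definitions, there is no key to group on.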

Hope this helps!

lmotamar · October 2010

Actually, when I ran the example, I did specify the traceID as 'id' for the trace element.

I suppose it didn't save properly.

Now I ran it again with traceID set to id for the trace element, and I am still getting the same error. Could you set the traceID to "id", run the mapping, and check it for me?

Please.
---
Narayan.

JBuijs · October 2010

Dear Narayan,

I inspected your igstk_latest.mapping file and did the following:
- added id in the trace property traceID
- added ' symbols (single quotes) around the event name and transition, since these are fixed values rather than column names.
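
The difference between a quoted fixed value and an unquoted column name can be tried out with a small Python sketch. It uses an in-memory SQLite database as a stand-in for the ODBC Text Driver, which is an assumption made purely for illustration. The table contents are made up, and the SQL dialects differ in other respects.

```python
import sqlite3


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE igstk (id INTEGER, Reporter TEXT)")
    conn.execute("INSERT INTO igstk VALUES (1, 'alice')")

    # 'bug create' is quoted: a fixed value, identical for every row.
    # igstk.Reporter is unquoted: a column reference, resolved per row.
    quoted = conn.execute(
        "SELECT 'bug create' AS concept_name, igstk.Reporter AS org_resource "
        "FROM igstk"
    ).fetchone()

    # Without quotes the fixed value is read as a column name that does not exist.
    try:
        conn.execute("SELECT complete AS lifecycle_transition FROM igstk")
        unquoted_error = None
    except sqlite3.OperationalError as exc:
        unquoted_error = str(exc)

    conn.close()
    return quoted, unquoted_error


quoted, unquoted_error = run_demo()
print(quoted)
print(unquoted_error)
```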

I uploaded the new mapping file here:
http://www.win.tue.nl/~jbuijs/files/tmp/igstk_Joos_TraceIDAdded.mapping

I cannot test it because I don't have the .ini file that tells the ODBC driver how to interpret the CSV file.

If you encounter any more errors could you:
- provide the .ini file so I can run the mapping file
- provide the console output for that run with debug messages so I can pinpoint the error

Hope this helps you further.

lmotamar · October 2010

Joos,

The mapping file you provided worked. Thank you.

And now I understand the difference in how values are specified for attributes.

One more small issue:
-> In the mapping you provided, there is an event attribute "org:resource" with value igstk.Reporter,
where Reporter is the column name from IGSTK.csv.

Similarly, for the "time:timestamp" event attribute I gave the value igstk.DateSubmitted, where DateSubmitted is the column name from IGSTK.csv.
{Initially the column name was Date Submitted, which I changed to DateSubmitted, removing the space}.

Now my problem is that I saved the result as .MXML and opened it in ProM 5.2.

In the filter section, it doesn't show me start event, end event and event type values
(http://www.public.asu.edu/~lmotamar/XESame/prom5_screenshot.jpg).
{Although I am able to run the conversion and generate .MXML/.XES files, without proper attribute definitions I am not able to get proper results from the Mining/Analysis plugins.}

I think there is some problem in my attribute definitions.

Please correct me.

---
Narayan,

JBuijs · October 2010

Dear Narayan,

In your mapping you specified one 'fixed' event: 'bug create' and gave it the lifecycle transition 'resolved'. This means that, in your case, every record in your source file will become one trace with one event named 'bug create'.

If there is more information in your log file you should add more event definitions. If there is for instance another timestamp column then this could indicate another event which you can include by creating another event definition.

Furthermore, you used the lifecycle transition 'resolved'. This is not one of the standard lifecycle states. See the XES standard definition (http://www.xes-standard.org/_media/xes/xes_standard_proposal.pdf) for the standard lifecycle schema.

So you have only one event per trace since you specified only one event per trace.
Your event type is 'unknown' since it's not one of the standard event lifecycle states. ProM 6 should be able to handle this better. In general, it's best to use one of these standard states.
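
A quick way to check a lifecycle transition before putting it in the mapping is sketched below in Python. The list of transitions is reproduced from memory of the XES standard lifecycle model and should be verified against the specification linked above. The function name normalize_transition is hypothetical, as is the choice to lowercase the input.

```python
# Transitions of the XES standard lifecycle model (verify against the standard).
STANDARD_TRANSITIONS = frozenset({
    "schedule", "assign", "withdraw", "reassign", "start", "suspend",
    "resume", "pi_abort", "ate_abort", "complete", "autoskip",
    "manualskip", "unknown",
})


def normalize_transition(value):
    """Return the transition if it is a standard one, otherwise 'unknown'."""
    candidate = value.strip().lower()
    return candidate if candidate in STANDARD_TRANSITIONS else "unknown"


print(normalize_transition("resolved"))
print(normalize_transition("complete"))
```

Here 'resolved' falls back to 'unknown', which matches the event type reported for this mapping, while 'complete' is accepted as is.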

Hope this helps.

Joos

lmotamar · October 2010

Dear Joos,

The above info you provided helped. I was able to run the execution using standard lifecycle states.

I have a new issue now, with JOIN statements.

I have 2 events: 'Bugcreated' (associated with IGSTK.csv) and 'Bugcompleted' (associated with IGSTK_1.csv).

I added this join statement: IGSTK_1.csv AS igstk_a ON igstk_a.id = igstk.id
in the 'Bugcreated' event.

Now I am getting the following errors. Please advise.
---
2010-10-31 00:26:54: (DEBUG) We will insert a new attribute (concept:name with value igstk: 10842) into the cache DB for the Trace item with ID 10842
2010-10-31 00:26:54: (DEBUG) We will insert a new attribute (IGSTK_details with value ) into the cache DB for the Trace item with ID 10842
2010-10-31 00:26:54: (PROGRESS) Starting step 4 of 6: Extracting Event: Bugcreated.
2010-10-31 00:26:54: (DEBUG) We are about to run the following query to extract the items for Event: Bugcreated:
SELECT id AS [traceID], IGSTK1.csv AS igstka ON igstka.id = igstk.id AS [orderAttribute], 'igstk ' & igstk.id AS [concept_instance], 'bug create' AS [concept_name], 'assign' AS [lifecycle_transition], igstk.Reporter AS [org_resource], igstk.DateSubmitted AS [time_timestamp] FROM IGSTK.csv AS igstk
2010-10-31 00:26:54: (ERROR) There is an error in the query (turn on 'debug' mode to see it) to fetch the information for Event: Bugcreated, this is the error we got: (-3504) [Microsoft][ODBC Text Driver] The SELECT statement includes a reserved word or an argument name that is misspelled or missing, or the punctuation is incorrect.
2010-10-31 00:26:56: (PROGRESS) We stopped execution because of a critical error.

---
Narayan.

JBuijs · November 2010

Dear Narayan,

As I see it you entered 'IGSTK1.csv AS igstka ON igstka.id = igstk.id' into the 'order attribute' property. This should be added in the 'link' property. The order by property should be set to the same value as the timestamp (datesubmitted).

However, I don't think you need to join tables for this. You can use just the igstk.csv table.
For one event definition you specify 'bugcreated' as its name and the corresponding column for the timestamp of this event.
In the 'bugcompleted' event definition you specify the name and timestamp column of this event.
In both definitions you use the igstk.csv table.
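
The idea of two event definitions reading from the same table can be sketched in Python as follows. The column name DateResolved and the sample rows are hypothetical, since the thread only names DateSubmitted. The function extract_events illustrates the concept and is not part of XESame.

```python
import csv
import io

# One entry per event definition; both read from the same table.
EVENT_DEFINITIONS = [
    {"name": "bugcreated", "timestamp_column": "DateSubmitted"},
    {"name": "bugcompleted", "timestamp_column": "DateResolved"},  # hypothetical column
]


def extract_events(csv_text):
    """Emit one event per row and event definition that has a timestamp."""
    events = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for definition in EVENT_DEFINITIONS:
            timestamp = row.get(definition["timestamp_column"], "")
            if timestamp:
                events.append({
                    "traceID": row["id"],
                    "concept_name": definition["name"],
                    "time_timestamp": timestamp,
                })
    return events


# Made-up sample data: the second bug has not been completed yet.
SAMPLE = (
    "id,Reporter,DateSubmitted,DateResolved\n"
    "1,alice,2010-10-01,2010-10-05\n"
    "2,bob,2010-10-02,\n"
)

for event in extract_events(SAMPLE):
    print(event)
```

Each row yields up to two events on the same trace, so no join is needed.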

Let me know if this works for you.

amrad · November 2011

Dear Joos,

The content of this link, http://prom.win.tue.nl/research/wiki/_media/xesame/xesma_thesis_final.pdf, is not available now.

JBuijs · November 2011

Hi Amrad,

We shut down the prom.win.tue.nl server.
Please look again at http://www.processmining.org/xesame/start

See also http://www.win.tue.nl/promforum/discussion/171/prom.win.tue.nl-offline#Item_1

amrad · November 2011

Hi Joos,

Thanks a lot.
The two links you provided work. I have installed ProM 6 and it works very well, as do the XESame examples. I will try to create a mapping for my project until you send me the mapping for the columns I have (No, Time, Source, Destination, Protocol, Length and Info). These are the columns we discussed by email last time.

Thank you for your help and your collaboration.
Amrad

JBuijs · November 2011

Hi Amrad,

I'm afraid that I cannot help you with your mapping.
If you read my thesis, Chapter 3 should provide you with an idea on how to specify the mapping.

Good luck!

amrad · December 2011

Dear Joos,

I have enjoyed reading your thesis and learned many things from it. I executed your examples and they work.
After that, I created a mapping for my CSV file, but I have some problems (there are some errors).
Could you please check the execution and tell me how to correct the errors and what could be wrong in the mapping?
Attached is a zip file containing three files (New.csv, schema.ini, newmapping.mapping).

thank you for your help.

JBuijs · December 2011

Hi Amrad,

I'm really busy (paper deadlines), so I cannot test your mapping and correct it for you.
If searching for the error message XESame gives you does not provide hints on how to solve it, you can post it here and I can give you advice.

amrad · December 2011

Hi Joos,

Thank you for your reply.
Here is the console output:

2011-12-06 18:23:05: (PROGRESS) Starting step 1 of 5: initialization.
2011-12-06 18:23:05: (PROGRESS) Starting step 2 of 5: Extracting log info.
2011-12-06 18:23:05: (DEBUG) We are about to run the following query to extract the items for Log:
SELECT 'reseau' AS [concept_name], 'standard' AS [lifecycle_model] FROM
2011-12-06 18:23:05: (ERROR) There is an error in the query (turn on 'debug' mode to see it) to fetch the information for Log
2011-12-06 18:23:05: (ERROR) [Microsoft][ODBC Text Driver] Syntax error in FROM clause.
2011-12-06 18:23:05: (WARNING) Cancelling execution! Just after running the query.
2011-12-06 18:23:05: (NOTICE) Execution safely terminated.

Joos, I apologize for bothering you, but it is a simple example.

amrad · December 2011

Hi Joos,

For clarification, please find attached the console output. I made some changes.

Thank you.

amrad · December 2011

Dear Joos,

I read the answers you posted to Narayan for the same error and made some changes in my mapping, but I still get the same error. Please find attached the latest console output.

Could you please look at the mapping and tell me how I can solve this error?

Thanks a lot.

JBuijs · December 2011

Hi Amrad,

Good to see that you managed to solve the first error.

The error you now get is caused by forgetting to define a traceID in the trace properties.
The traceID should generate a unique number to identify each trace (in your case for instance the paquet detail no). This is used to connect the events to the traces.

amrad · December 2011

Hi Joos,

Thanks a lot for your reply.

I have specified the traceID in the trace properties, but I still get the same error and the same message.

Please advise me on what I can do in this case.

Thank you in advance.

JBuijs · December 2011

Hi Amrad,

Are you sure it is the exact same error?

Are you sure the properties were saved? (Check this by going to another object and then back. You need to explicitly exit the field; only then are values saved.)
Are you sure you also set the traceID property for your event mappings?

amrad · December 2011

Hi Joos,

I'm sure that it is the same error, and all the properties are saved. I am also sure that I set the traceID for my event mappings.

The error is not yet solved.
Thank you, Joos.
Amrad

JBuijs · December 2011

Hi Amrad,

Did you check whether, for your database system, the other table fields are reserved words? You can try to surround them with []
(as is automatically done for the AS [...] in the query).
Words like 'time' and 'length' are, as the error message suggests, often reserved words.
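
Surrounding column references with [] could look like the following Python sketch. The helper bracket is hypothetical and only shows the resulting text to type into the mapping. It is meant for column references such as t.Time, not for file names like New.csv, and it does not handle names that themselves contain a ] character. The alias t is an assumption.

```python
def bracket(identifier):
    """Wrap each dot-separated part of a column reference in square brackets."""
    parts = identifier.split(".")
    return ".".join(
        part if part.startswith("[") else f"[{part}]" for part in parts
    )


# Column names taken from this thread; the table alias 't' is made up.
for column in ("Time", "t.Length", "t.[Info]"):
    print(column, "->", bracket(column))
```

Bracketing every column reference, reserved or not, is harmless and avoids having to look up which words the driver reserves.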

amrad · January 2012

Hello Joos,

First of all I wish you a happy new year 2012.
I'm back in the forum with the same problem. I have checked the table fields and they are not reserved words in my database system.
Do you have any other suggestions, please?

thank you.

JBuijs · January 2012

Hi Amrad,

Happy new year to you too!

Maybe it won't work, but I noticed that the Paquet_details attribute has '' as its value. Could you try to enter something there, such as 'bla'? It might be that the ODBC driver fails there.

If this does not work, try to remove as many custom attributes as you can and check all remaining entries for errors.
If the error still persists, try to run the query directly on the data source, for instance by using SquirrelSQL (just google for the tool or search the forum here; I suggested it somewhere before).

ptocto · September 2015

Dear Joos,
I am new to process mining. Could you please point me to a page that explains how to create a mapping file?
Thanks in advance.
Paul

JBuijs · September 2015

Dear Paul,

Welcome to the forum!

You can browse my master's thesis, which can be found here:
http://www.processmining.org/xesame/start
