Using Free-Choice Nets for Process Mining and Business Process Management

Abstract: Free-choice nets, a subclass of Petri nets, have been studied for decades. They are interesting because they have many desirable properties that general Petri nets do not have, and they can be analyzed efficiently. Although the majority of process models used in practice are inherently free-choice, most users (even modeling experts) are not aware of free-choice net theory and the associated analysis techniques. This paper discusses free-choice nets in the context of process mining and business process management. For example, state-of-the-art process discovery algorithms like the inductive miner produce process models that are free-choice. Also, hand-made process models using languages like BPMN tend to be free-choice because choice and synchronization are separated in different modeling elements. Therefore, we introduce basic notions and results for this important class of process models. Moreover, we present new results for free-choice nets that are particularly relevant for process mining. For example, we elaborate on home clusters and lucency as closely related and desirable correctness notions. We also discuss the limitations of free-choice nets in process mining and business process management, and suggest research directions to extend free-choice nets with non-local dependencies.


I. INTRODUCTION
Free-choice nets can be used to model processes that include process patterns such as sequence, choice, loop, and concurrency. Compared to general Petri nets, they require choice and synchronization to be separable. This is automatically the case in languages having explicit split and join operators (also called connectors or gateways) that do not mix choice and synchronization. For example, when using Business Process Modeling Notation (BPMN) with only AND and XOR gateways, the behavior is automatically free-choice. Although BPMN allows for many advanced constructs, the most widely used BPMN constructs can be easily mapped onto free-choice nets.
In this paper, we relate recent developments in free-choice nets to Business Process Management (BPM) in general and process mining in particular. The desire to manage and improve processes is not new. The field of scientific management emerged in the 1890s with pioneers like Frederick Winslow Taylor (1856-1915) [31]. Taylor already systematically analyzed manually recorded data in order to uncover potential process improvements. With the availability of computers, the focus shifted to automation. In the 1970s, there was the expectation that office work would become increasingly automated, not requiring human intervention. Pioneers like Skip Ellis [18] and Michael Zisman [34] worked on so-called office automation systems. These ideas led to the development of Workflow Management (WFM) systems in the 1990s (see [8]). Later, BPM systems broadened the scope from automation to management. In hindsight, these approaches were not so successful. For example, as the longitudinal study in [28] shows, many workflow implementations failed. As a result, WFM/BPM technology is often considered too expensive and only feasible for highly-structured processes. At the same time, people continued to model processes using flowchart-like description languages. For example, modeling tools such as ARIS and Signavio have been used to model millions of processes all over the globe. Modeling is less costly than automation, but the effect is often limited. Due to the disconnect between reality and such hand-made models, the BPM market was shrinking until recently. However, this changed with the uptake of process mining [2].
Process mining dramatically changed the way we look at process models and operational processes. Even seemingly simple processes like Purchase-to-Pay (P2P) and Order-to-Cash (O2C) are often amazingly complex, and traditional hand-made process models fail to capture the true fabric of such processes. Process mining bridges the gap between process science (i.e., tools and techniques to improve operational processes) and data science (i.e., tools and techniques to extract value from data). Figure 1 shows ProM's inductive miner [22] in action. Based on (heavily filtered) data from SAP's Purchase-to-Pay (P2P) process, a process model is derived. Process discovery is just one of several process mining tasks. First, event data need to be extracted from information systems like SAP. Process discovery techniques transform such event data into process models (e.g., BPMN, Petri nets, and UML activity diagrams). There are simple approaches, like creating so-called Directly-Follows Graphs (DFGs), that do not discover concurrency and thus have obvious problems [4]. Dozens, if not hundreds, of more sophisticated algorithms were proposed [12], [2], [13], [20], [21], [22], [33]. Using replay and alignment techniques, it is possible to do conformance checking and relate process models (hand-made or discovered) with event data. This can be used to discover differences between reality and model [2], [16], [30]. Moreover, the model can be extended with additional perspectives, e.g., organizational aspects, decisions, and temporal aspects. Currently, there are over 35 commercial process mining vendors (ABBYY Timeline, ARIS Process Mining, BusinessOptix, Celonis Process Mining, Disco/Fluxicon, Everflow, Lana, Mavim, MPM, Minit, PAFnow, QPR, etc.) and process mining is applied in most of the larger organizations. Figure 2 shows a BPMN model discovered using the Celonis process mining software. The same model can also be used for conformance checking and to show where reality and model deviate.
Unlike traditional WFM/BPM technologies, there is a direct connection to the data. This allows stakeholders to spot inefficiencies, delays, and compliance problems in real-time. Process mining revitalized the BPM discipline, as is proven by the valuation of process mining firms. For example, Celonis is currently the first and only German "Decacorn" (i.e., a start-up whose value is considered to be over $10 billion).
Fig. 3. A free-choice net generated from the models in Figures 1 and 2.
So how is this related to free-choice nets? Process models play a key role in BPM and process mining, and these models can often be viewed as free-choice. Commonly used process notations are DFGs, BPMN models, Petri nets, and process trees. For example, the inductive mining approach uses process trees [22]. Although not visible, Figures 1 and 2 were actually generated using this approach. Process trees can be visualized using BPMN or Petri nets. Figure 3 shows the Petri net representation of the process tree. Any process tree corresponds to a so-called free-choice net having the same behavior. Later we will provide a formal definition for these notions. At this stage, it is sufficient to know that, in a free-choice net, choice and synchronization can be separated.
Any process tree can be converted to a free-choice net. Moreover, a large class of BPMN models is inherently free-choice. In a BPMN model, there are flow objects such as events, activities, and gateways that are connected through directed arcs and together form a graph [26]. There are many modeling elements, but most process modelers use only a small subset [24]. For example, in many models, only exclusive gateways (for XOR-splits/joins) and parallel gateways (for AND-splits/joins) are used. Such models can be converted to free-choice nets [27]. It is also possible to convert BPMN models with inclusive gateways (i.e., OR-splits/joins) into free-choice nets (as long as the splits and joins are matching).
Since most process discovery techniques discover process models that are free-choice, and people modeling processes also tend to come up with free-choice models, this is an interesting class to study. Therefore, this paper focuses on free-choice models. The goal is to expose people interested in BPM and process mining to free-choice net theory.
Section II introduces preliminaries, including Petri nets, free-choice nets, and lucency. Lucency is a rather new notion which states that there cannot be two states enabling the same set of activities. Section III focuses on the class of process models having so-called home clusters. This class extends the class of sound models that can always terminate (e.g., no deadlocks) with the class of models that have a regeneration point. Free-choice nets with home clusters are guaranteed to be lucent. Hence, these nets are interesting for a wide range of applications and an interesting target class for process mining. Section IV discusses the limitations of free-choice nets, e.g., the inability to express non-local (i.e., long-term) dependencies. These insights may help to develop better process discovery techniques that produce more precise models. Section V concludes this paper.

II. PRELIMINARIES
Free-choice nets are well studied [14], [15], [19], [32]. The definitive book on the structure theory of free-choice nets is [17]. To keep the paper self-contained, standard Petri net notions are introduced first. If anything is unclear, consider reading one of the standard introductions [11], [25], [29]. Most of the notations used are adopted from [6].
Definition 1 (Petri Net): A Petri net is a tuple N = (P, T, F) with P the non-empty set of places, T the non-empty set of transitions such that P ∩ T = ∅, and F ⊆ (P × T) ∪ (T × P) the flow relation such that the graph (P ∪ T, F) is (weakly) connected.
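Definition 2 (Pre- and Post-Set): Let N = (P, T, F) be a Petri net. For any x ∈ P ∪ T: •x = {y | (y, x) ∈ F} and x• = {y | (x, y) ∈ F}.
For example, in Figure 4, •p2 = {t1, t2}, •t5 = {p4, p6, p7}, t1• = {p2, p3, p4}, and p8• = ∅.
Definition 3 (Marking): Let N = (P, T, F) be a Petri net. A marking M is a multiset of places, i.e., M ∈ B(P).¹ (N, M) is a marked net.
In the marking shown in Figure 4, transitions t1 and t2 are enabled. An enabled transition t can fire, consuming a token from each input place in •t and producing a token for each output place in t•.
Definition 4 (Enabling, Firing Rule, Reachability): Let (N, M) be a marked net with N = (P, T, F). Transition t ∈ T is enabled if •t ⊆ M.² This is denoted by (N, M)[t⟩ (each of t's input places •t contains at least one token). en(N, M) = {t ∈ T | (N, M)[t⟩} is the set of enabled transitions. Firing an enabled transition t results in marking M′ = (M \ •t) ∪ t•. (N, M)[t⟩(N, M′) denotes that t is enabled in M and firing t results in marking M′. A marking M′ is reachable from M if there exists a firing sequence σ such that (N, M)[σ⟩(N, M′). R(N, M) = {M′ ∈ B(P) | ∃σ ∈ T* : (N, M)[σ⟩(N, M′)} is the set of all reachable markings. (N, M)[σ⟩ denotes that the sequence σ is enabled when starting in marking M (without specifying the resulting marking).
Let N be the Petri net shown in Figure 4. (N, [p1])[σ1⟩(N, [p4, p6, p7]) with σ1 = ⟨t1, t3, t4⟩ and (N, [p1])[σ2⟩(N, [p8]) with σ2 = ⟨t2, t4, t3, t6⟩.
We also define the usual properties for Petri nets.
Definition 5 (Live, Bounded, Safe, Dead, Deadlock-free, Well-Formed): A marked net (N, M) is live if for every reachable marking M′ ∈ R(N, M) and for every transition t ∈ T there exists a marking M″ ∈ R(N, M′) that enables t. A marked net (N, M) is k-bounded if for every reachable marking M′ ∈ R(N, M) and every p ∈ P: M′(p) ≤ k. A marked net (N, M) is bounded if there exists a k such that (N, M) is k-bounded. A 1-bounded marked net is called safe. A place p ∈ P is dead in (N, M) when it can never be marked (no reachable marking marks p). A transition t ∈ T is dead in (N, M) when it can never be enabled (no reachable marking enables t). A marked net (N, M) is deadlock-free if each reachable marking enables at least one transition. A Petri net N is structurally bounded if (N, M) is bounded for any marking M. A Petri net N is structurally live if there exists a marking M such that (N, M) is live. A Petri net N is well-formed if there exists a marking M such that (N, M) is live and bounded.
¹ In a multiset, elements may appear multiple times, e.g., M = [p1, p2, p2, p2] = [p1, p2³] is a multiset with four elements (three have the same value).
² M1 ⊆ M2 (inclusion), M1 ∪ M2 (union), and M1 \ M2 (difference) are defined for multisets in the usual way (i.e., taking into account the cardinalities). Sets are treated as multisets where all elements have cardinality 1.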
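The firing rule and the behavioral properties above can be explored by brute force on small nets. The following is a minimal Python sketch, not part of the original paper. It encodes the net of Figure 4 as reconstructed from the pre- and post-sets and firing sequences mentioned in the text; the wiring of t3 (p2 to p6) and t4 (p3 to p7) is an assumption, and all function names are illustrative. Markings are represented as sorted tuples of places.

```python
from collections import Counter

# Figure 4, reconstructed from the text: transition -> (input places, output places).
# Assumption: t3 moves the token from p2 to p6 and t4 from p3 to p7.
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}

def enabled(net, marking):
    """en(N, M): transitions whose input places all carry a token."""
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    """Firing rule: consume one token per input place, produce one per output place."""
    if t not in enabled(net, marking):
        raise ValueError(f"{t} is not enabled in {marking}")
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def run(net, marking, sigma):
    """Fire the sequence sigma and return the resulting marking."""
    for t in sigma:
        marking = fire(net, marking, t)
    return marking

def reachable(net, initial, limit=10_000):
    """R(N, M) by exhaustive search; the limit guards against unbounded nets."""
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def dead_transitions(net, initial):
    """Transitions that no reachable marking enables."""
    fired = set().union(*(enabled(net, m) for m in reachable(net, initial)))
    return set(net) - fired

def is_safe(net, initial):
    return all(n <= 1 for m in reachable(net, initial) for n in Counter(m).values())

def is_deadlock_free(net, initial):
    return all(enabled(net, m) for m in reachable(net, initial))

def is_live(net, initial):
    # Live: from every reachable marking, every transition can still become enabled.
    return all(not dead_transitions(net, m) for m in reachable(net, initial))

START = ("p1",)
print(run(FIG4, START, ["t1", "t3", "t4"]))
print(is_safe(FIG4, START), is_deadlock_free(FIG4, START), is_live(FIG4, START))
```

Under these assumptions, the net is safe and has no dead transitions, but it is neither deadlock-free nor live, because no transition is enabled once only p8 is marked. The exhaustive search is exponential in general and is meant only for small examples.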
Definition 6 (Proper Petri Net): A Petri net N = (P, T, F) is proper if all transitions have input and output places, i.e., for all t ∈ T: •t ≠ ∅ and t• ≠ ∅.
Definition 7 (Strongly Connected): A Petri net N = (P, T, F ) is strongly connected if there is a directed path between any pair of nodes.
Note that a strongly connected net is also proper. Figure 4 shows that the converse does not hold: the net is proper, but not strongly connected.
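Both structural properties can be checked directly on the graph (P ∪ T, F). The sketch below is an illustration rather than the paper's method; it uses the Figure 4 net as reconstructed from the text (the wiring of t3 and t4 is an assumption) and a hypothetical short-circuit transition named t_sc that connects p8 back to p1.

```python
# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}
# Hypothetical short-circuited variant: t_sc returns the token from p8 to p1.
SHORT_CIRCUITED = {**FIG4, "t_sc": (("p8",), ("p1",))}

def is_proper(net):
    """Every transition has at least one input and one output place."""
    return all(pre and post for pre, post in net.values())

def successors(net):
    """Adjacency sets of the directed graph (P ∪ T, F)."""
    succ = {}
    for t, (pre, post) in net.items():
        succ.setdefault(t, set()).update(post)
        for p in pre:
            succ.setdefault(p, set()).add(t)
        for p in post:
            succ.setdefault(p, set())
    return succ

def reach_from(graph, start):
    """All nodes reachable from start by following the arcs of graph."""
    seen, todo = {start}, [start]
    while todo:
        for node in graph[todo.pop()]:
            if node not in seen:
                seen.add(node)
                todo.append(node)
    return seen

def is_strongly_connected(net):
    succ = successors(net)
    pred = {node: set() for node in succ}
    for node, targets in succ.items():
        for target in targets:
            pred[target].add(node)
    start = next(iter(succ))
    # Strongly connected: one node reaches all nodes, and all nodes reach it.
    return reach_from(succ, start) == set(succ) == reach_from(pred, start)

print(is_proper(FIG4), is_strongly_connected(FIG4))
print(is_proper(SHORT_CIRCUITED), is_strongly_connected(SHORT_CIRCUITED))
```

The reconstructed net is proper but not strongly connected, since p8 has no outgoing arcs; adding the hypothetical t_sc makes it strongly connected.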
Definition 8 (Home Marking): Let (N, M) be a marked net. A marking M_H is a home marking if for every reachable marking M′ ∈ R(N, M): M_H ∈ R(N, M′).
The marked Petri net in Figure 4 has one home marking: [p8].
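A home marking is one that can be reached again from every reachable marking. As a rough illustration, the sketch below enumerates the reachable home markings of the Figure 4 net by exhaustive search. The net is reconstructed from the text (the wiring of t3 and t4 is an assumption), the short-circuit transition t_sc is hypothetical, and markings are sorted tuples of places.

```python
from collections import Counter

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}
# Hypothetical short-circuited variant: t_sc returns the token from p8 to p1.
SHORT_CIRCUITED = {**FIG4, "t_sc": (("p8",), ("p1",))}

def enabled(net, marking):
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def reachable(net, initial, limit=10_000):
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def home_markings(net, initial):
    """Reachable markings that remain reachable from every reachable marking."""
    reach = reachable(net, initial)
    return {h for h in reach if all(h in reachable(net, m) for m in reach)}

print(home_markings(FIG4, ("p1",)))
print(len(home_markings(SHORT_CIRCUITED, ("p1",))))
```

For the reconstructed net, only the marking with a single token in p8 qualifies. In the short-circuited variant, every reachable marking is a home marking because the net can always return to its initial state.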

B. Free-Choice Nets
The concepts and notations discussed apply to any Petri net. Now we focus on the class of free-choice nets. As indicated in the introduction, this is an important class because most process models used in the context of BPM and process mining are free-choice.
Definition 9 (Free-choice Net): Let N = (P, T, F) be a Petri net. N is a free-choice net if for any t1, t2 ∈ T: •t1 = •t2 or •t1 ∩ •t2 = ∅.
The Petri net in Figure 4 is not free-choice because •t5 ∩ •t6 = {p6, p7} ≠ ∅, but •t5 ≠ •t6. If we remove the places p4 and p5, then the net becomes free-choice. The places model a so-called long-term (or non-local) dependency: the choice between t1 and t2 in the beginning controls the choice between t5 and t6 at the end. The process model discovered using ProM (Figure 1) and Celonis (Figure 2) based on filtered SAP data is free-choice. Figure 3 shows the corresponding free-choice net.
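The free-choice condition is purely structural: any two transitions either have identical presets or disjoint presets. A minimal sketch of this check is given below. The presets are those of the Figure 4 net as reconstructed from the text (the presets of t3 and t4 are an assumption), and the function names are illustrative.

```python
from itertools import combinations

# Presets of the transitions in Figure 4, reconstructed from the text.
FIG4_PRE = {
    "t1": {"p1"},
    "t2": {"p1"},
    "t3": {"p2"},
    "t4": {"p3"},
    "t5": {"p4", "p6", "p7"},
    "t6": {"p5", "p6", "p7"},
}
# The same net after removing the places p4 and p5.
WITHOUT_P4_P5 = {t: pre - {"p4", "p5"} for t, pre in FIG4_PRE.items()}

def free_choice_violations(presets):
    """Pairs of transitions whose presets overlap without being equal."""
    return [
        (a, b)
        for a, b in combinations(sorted(presets), 2)
        if presets[a] & presets[b] and presets[a] != presets[b]
    ]

def is_free_choice(presets):
    return not free_choice_violations(presets)

print(free_choice_violations(FIG4_PRE))
print(is_free_choice(WITHOUT_P4_P5))
```

The only offending pair in the reconstructed net is (t5, t6), which matches the explanation above; once p4 and p5 are removed, both transitions have the preset {p6, p7} and the check succeeds.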

C. Lucency
The notion of lucency was first introduced in [3]. A marked Petri net is lucent if there are no two different reachable markings enabling the same set of transitions, i.e., states are fully characterized by the transitions they enable.
Definition 10 (Lucent Petri nets): Let (N, M) be a marked Petri net. (N, M) is lucent if and only if for any M1, M2 ∈ R(N, M): en(N, M1) = en(N, M2) implies M1 = M2.
The marked Petri nets in Figures 3 and 5 are lucent, i.e., there are no two reachable markings that enable the same set of transitions. The marked Petri net in Figure 4 is not lucent. Markings M1 = [p2, p3, p4] and M2 = [p2, p3, p5] are both reachable and enable transitions t3 and t4.
Lucency is often a desirable property. Think, for example, of an information system that has a user interface showing what the user can do. In this setting, lucency implies that the offered actions fully determine the internal state and the system will behave consistently from the user's viewpoint. If the information system were not lucent, the user could encounter situations where the set of offered actions is the same, but the behavior is very different. Another example is the worklist of a workflow management system that shows the workitems that can or should be executed. Lucency implies that the state of a case can be derived based on the workitems offered for it [6].
Characterizing the class of systems that are lucent is a foundational and also challenging question [3], [6], [7].
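For small nets, lucency can be tested by grouping the reachable markings by the set of transitions they enable. The sketch below does this by exhaustive search. It is an illustration under stated assumptions: the Figure 4 net is reconstructed from the text (the wiring of t3 and t4 is assumed), the variant without p4 and p5 is derived from it, and markings are sorted tuples of places.

```python
from collections import Counter, defaultdict

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}

def drop_places(net, places):
    """Copy of the net with the given places (and their arcs) removed."""
    keep = lambda ps: tuple(p for p in ps if p not in places)
    return {t: (keep(pre), keep(post)) for t, (pre, post) in net.items()}

WITHOUT_P4_P5 = drop_places(FIG4, {"p4", "p5"})

def enabled(net, marking):
    tokens = Counter(marking)
    return frozenset(t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre))

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def reachable(net, initial, limit=10_000):
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def lucency_violations(net, initial):
    """Map each enabled set shared by several reachable markings to those markings."""
    by_enabled = defaultdict(list)
    for marking in reachable(net, initial):
        by_enabled[enabled(net, marking)].append(marking)
    return {en: sorted(ms) for en, ms in by_enabled.items() if len(ms) > 1}

def is_lucent(net, initial):
    return not lucency_violations(net, initial)

print(lucency_violations(FIG4, ("p1",))[frozenset({"t3", "t4"})])
print(is_lucent(WITHOUT_P4_P5, ("p1",)))
```

For the reconstructed net, the markings [p2, p3, p4] and [p2, p3, p5] are reported as indistinguishable, in line with the example above, while the variant without p4 and p5 passes the check.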

III. FREE-CHOICE NETS WITH HOME CLUSTERS
Workflow nets form a subclass of Petri nets starting with a source place start and ending with a sink place end [9]. The modeled workflow can be instantiated by putting tokens on the input place start. In the context of workflow nets, a correctness criterion called soundness has been defined [9]. A workflow net is sound if and only if the following three requirements are satisfied: for each case, it is always still possible to reach the state which just marks place end (option to complete); if place end is marked, all other places are empty for a given case (proper completion); and it should be possible to execute an arbitrary activity by following the appropriate route through the workflow net (no dead transitions) [9]. In [1], it was shown that soundness is decidable and can be translated into a liveness and boundedness problem, i.e., a workflow net is sound if and only if the corresponding short-circuited net (i.e., the net where place end is connected to place start) is live and bounded. This can be checked in polynomial time for free-choice nets [1]. Figures 3 and 4 show two sound workflow nets. Figures 5 and 6 show free-choice nets that do not have a designated start and end place. Hence, soundness is not defined for these models.
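The three soundness requirements can be checked literally on the reachability graph of a small workflow net. The sketch below takes this brute-force route; it is not the polynomial procedure for free-choice nets referred to above. The Figure 4 net is reconstructed from the text (the wiring of t3 and t4 is an assumption), and BROKEN is a hypothetical variant introduced only for contrast.

```python
from collections import Counter

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}
# Hypothetical unsound variant: t1 no longer produces a token in p4.
BROKEN = {**FIG4, "t1": (("p1",), ("p2", "p3"))}

def enabled(net, marking):
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def reachable(net, initial, limit=10_000):
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def is_sound(net, start, end):
    initial, final = (start,), (end,)
    reach = reachable(net, initial)
    # Option to complete: the final marking stays reachable from every state.
    option_to_complete = all(final in reachable(net, m) for m in reach)
    # Proper completion: whenever end is marked, nothing else is marked.
    proper_completion = all(m == final for m in reach if end in m)
    # No dead transitions: every transition is enabled in some reachable state.
    no_dead = set().union(*(enabled(net, m) for m in reach)) == set(net)
    return option_to_complete and proper_completion and no_dead

print(is_sound(FIG4, "p1", "p8"))
print(is_sound(BROKEN, "p1", "p8"))
```

In the hypothetical BROKEN variant, the case gets stuck after t1, t3, and t4, so the option to complete is violated and t5 is dead.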
A strongly-connected Petri net cannot be a workflow net. However, the lion's share of Petri net theory focuses on strongly-connected Petri nets. Therefore, [6] investigated a new subclass of Petri nets having a so-called home cluster. First, we define the notion of a cluster. A cluster is a maximal set of connected nodes, only considering arcs connecting places to transitions.
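Definition 11 (Cluster): Let N = (P, T, F) be a Petri net and x ∈ P ∪ T. The cluster of node x, denoted [x]_c, is the smallest set such that (1) x ∈ [x]_c, (2) if p ∈ [x]_c ∩ P, then p• ⊆ [x]_c, and (3) if t ∈ [x]_c ∩ T, then •t ⊆ [x]_c. [N]_c = {[x]_c | x ∈ P ∪ T} is the set of clusters of N. For a cluster C ∈ [N]_c, Mrk(C) = [p ∈ C ∩ P] is the marking which only marks the places in C.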
A home cluster is a cluster that serves as a "target" that can always be reached again. Hence, it can be seen as a generalization of soundness.
Definition 12 (Home Clusters): Let (N, M) be a marked Petri net. C is a home cluster of (N, M) if and only if C ∈ [N]_c (i.e., C is a cluster) and Mrk(C) is a home marking of (N, M). If such a C exists, we say that (N, M) has a home cluster.
Property 1 (Sound Workflow Nets Have a Home Cluster): Let (N, M) be a sound workflow net. Then (N, M) has a home cluster.
Also, all short-circuited sound workflow nets are guaranteed to have a home cluster. All marked Petri nets shown thus far (i.e., Figures 3-6) have a home cluster. However, the nets in Figures 5 and 6 are not workflow nets.
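Clusters and home clusters can be computed mechanically for small nets. The sketch below groups nodes that are connected through place-to-transition arcs and then tests, by exhaustive search, whether the marking that puts one token in each place of a cluster is a home marking. The Figure 4 net is reconstructed from the text (the wiring of t3 and t4 is an assumption), t_sc is a hypothetical short-circuit transition, and all names are illustrative.

```python
from collections import Counter

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}
# Hypothetical short-circuited variant: t_sc returns the token from p8 to p1.
SHORT_CIRCUITED = {**FIG4, "t_sc": (("p8",), ("p1",))}

def enabled(net, marking):
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def reachable(net, initial, limit=10_000):
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def clusters(net):
    """Partition P ∪ T into sets connected via place-to-transition arcs."""
    places = {p for pre, post in net.values() for p in (*pre, *post)}
    cluster_of = {node: {node} for node in places | set(net)}
    for t, (pre, _) in net.items():
        merged = set().union(cluster_of[t], *(cluster_of[p] for p in pre))
        for node in merged:
            cluster_of[node] = merged
    return {frozenset(c) for c in cluster_of.values()}

def home_clusters(net, initial):
    """Clusters C for which Mrk(C) is reachable from every reachable marking."""
    reach = reachable(net, initial)
    result = set()
    for cluster in clusters(net):
        mrk = tuple(sorted(node for node in cluster if node not in net))
        if all(mrk in reachable(net, m) for m in reach):
            result.add(cluster)
    return result

print(sorted(sorted(c) for c in clusters(FIG4)))
print(home_clusters(FIG4, ("p1",)))
print(home_clusters(SHORT_CIRCUITED, ("p1",)))
```

Under these assumptions, the reconstructed net has five clusters, and only the cluster consisting of p8 is a home cluster. The short-circuited variant has two: the cluster of p1 and the cluster of p8.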
Most of the results for Petri nets, and in particular free-choice nets, are defined for well-formed nets [11], [14], [15], [17], [19], [25], [29], [32]. Recall that a Petri net N is well-formed if there exists a marking M such that (N, M) is live and bounded. Some well-known properties of well-formed free-choice nets:
• A well-formed free-choice net is strongly connected.
• A bounded and strongly-connected marked free-choice net is live if and only if it is deadlock-free.
• A marked free-choice net is live if and only if every proper siphon includes a marked trap.
• Well-formed free-choice nets are covered by P-components and T-components.
• Well-formedness can be decided in polynomial time for free-choice nets.
• Live and bounded free-choice nets have home markings.
PROCEEDINGS OF THE FEDCSIS. ONLINE, 2021
Interestingly, marked free-choice nets having a home cluster do not need to be well-formed. Yet, free-choice nets having a home cluster have interesting properties as demonstrated in [6]. A surprising result is that free-choice nets having a home cluster are lucent.
Theorem 1 (Home Clusters Ensure Lucency [6]): Let (N, M) be a marked proper free-choice net having a home cluster. Then (N, M) is lucent.
The theorem can be used to show that the process models in Figures 3, 5, and 6 are lucent.
Theorem 1 is surprising since there are T-systems (i.e., marked graphs) that are live, bounded, safe, well-formed, and strongly connected, yet not lucent. A proof of Theorem 1 is outside the scope of this paper (see [6] for details). However, it is important to note that the proof does not rely on any of the classical results for well-formed nets. Instead, several new concepts are introduced, such as:
• Expediting transitions in a firing sequence of a free-choice net. As long as the order per cluster is maintained, transitions can fire earlier without causing any problems (e.g., deadlocks).
• The notion of disentangled paths, i.e., paths in the net that start and end with a place and do not contain elements that belong to the same cluster. A C-rooted disentangled path ends with a place in cluster C.
• A C-rooted disentangled path is safe if C is a home cluster. This implies that marked proper free-choice nets having a home cluster must be safe.
• The notion of conflict-pairs, i.e., a pair of markings such that no transition is enabled in both markings, but if a transition is enabled in one marking, the other marking must mark at least one of its input places.
• A marked proper free-choice net having a home cluster cannot have any conflict-pairs.
These results make free-choice nets having a home cluster interesting candidate models in the context of BPM and process mining. However, as discussed next, there are also some limitations.

IV. ADDING NON-LOCAL DEPENDENCIES
Although many process discovery techniques return models that can be seen as free-choice, and process modelers using BPMN are more or less forced to draw free-choice models, there are some limitations when using free-choice nets. Consider again the Petri net in Figure 4, which is not free-choice due to the places p4 and p5. The process model allows for the following four traces: L1 = {⟨t1, t3, t4, t5⟩, ⟨t1, t4, t3, t5⟩, ⟨t2, t3, t4, t6⟩, ⟨t2, t4, t3, t6⟩}. Note that t1 is always followed by t5, and t2 is always followed by t6. In BPMN, we cannot express such dependencies (without resorting to data or other more advanced constructs). Ignoring the non-local dependencies represented by the places p4 and p5 leads to the BPMN model shown in Figure 7.
The corresponding free-choice net is shown in Figure 8. Both the BPMN model and the free-choice net allow for the following eight traces: L2 = {⟨t1, t3, t4, t5⟩, ⟨t1, t3, t4, t6⟩, ⟨t1, t4, t3, t5⟩, ⟨t1, t4, t3, t6⟩, ⟨t2, t3, t4, t5⟩, ⟨t2, t3, t4, t6⟩, ⟨t2, t4, t3, t5⟩, ⟨t2, t4, t3, t6⟩}. Hence, the number of possibilities doubled. Most process discovery techniques will be unable to capture such non-local dependencies. Given an event log with only traces from L1, most discovery techniques will produce a process model that allows for L2. Some of the region-based process mining techniques can discover the process model allowing for only L1. However, these techniques have many other problems: they tend to produce over-fitting models, cannot handle infrequent behavior, and are very time-consuming. Therefore, it may be better to first discover a free-choice backbone model that is then extended to make it more precise. Concretely, one can first discover a Petri net using the inductive mining approach and then add non-local dependencies. One can use, for example, a variant of the approach in [23] to add places. It is also possible to combine two types of arcs as in hybrid process models [10]. In [10], we use hybrid Petri nets and first discover a causal graph based on the event log. Based on different (threshold) parameters, we scan the event log for possible causalities. In the second phase, we try to learn places based on explicit quality criteria. Places added can be interpreted in a precise manner and have a guaranteed quality. Causal relations that cannot or should not be expressed in terms of places are added as sure or unsure arcs. A similar approach can be used for strongly correlating choices in a free-choice net.
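The difference between L1 and L2 can be reproduced by enumerating the complete traces of both nets. The sketch below does so recursively. It assumes the Figure 4 net as reconstructed from the text (the wiring of t3 and t4 is an assumption) and derives the free-choice backbone by removing p4 and p5. The enumeration only terminates for acyclic behavior such as this example.

```python
from collections import Counter

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}

def drop_places(net, places):
    """Copy of the net with the given places (and their arcs) removed."""
    keep = lambda ps: tuple(p for p in ps if p not in places)
    return {t: (keep(pre), keep(post)) for t, (pre, post) in net.items()}

# Free-choice backbone: the non-local dependencies p4 and p5 are ignored.
BACKBONE = drop_places(FIG4, {"p4", "p5"})

def enabled(net, marking):
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def traces(net, marking, prefix=()):
    """All maximal firing sequences from marking (finite behavior only)."""
    choices = sorted(enabled(net, marking))
    if not choices:
        return {prefix}
    result = set()
    for t in choices:
        result |= traces(net, fire(net, marking, t), prefix + (t,))
    return result

L1 = traces(FIG4, ("p1",))
L2 = traces(BACKBONE, ("p1",))
print(len(L1), len(L2), L1 < L2)
for trace in sorted(L2 - L1):
    print(trace)  # behavior allowed only after dropping p4 and p5
```

The traces in L2 that are not in L1 are exactly those combining t1 with t6 or t2 with t5, i.e., the behavior that the places p4 and p5 rule out.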
There is also an interesting connection to the notion of confusion. Confusion is the phenomenon that the order of executing concurrent transitions may influence choices in the model. Here, we consider a simpler notion and consider a Petri net to be confusion-free when transitions that share an input place either cannot be both enabled or have the same set of input places.
Definition 13 (Confusion-Free): A marked Petri net (N, M) with N = (P, T, F) is confusion-free if for any two transitions t1, t2 ∈ T with •t1 ∩ •t2 ≠ ∅ and •t1 ≠ •t2 there is no reachable marking M′ ∈ R(N, M) such that {t1, t2} ⊆ en(N, M′).
All models in this paper are confusion-free. Note that free-choice nets are by definition confusion-free. An interesting question is how to develop automatic conversions from models that are "almost free-choice".
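The simplified confusion-freeness condition combines a structural test (overlapping but unequal presets) with a behavioral one (both transitions enabled in some reachable marking). A brute-force sketch is shown below. The Figure 4 net is reconstructed from the text (the wiring of t3 and t4 is an assumption), and the second initial marking is a hypothetical one chosen only to produce a violation.

```python
from collections import Counter
from itertools import combinations

# Figure 4, reconstructed from the text: transition -> (input places, output places).
FIG4 = {
    "t1": (("p1",), ("p2", "p3", "p4")),
    "t2": (("p1",), ("p2", "p3", "p5")),
    "t3": (("p2",), ("p6",)),
    "t4": (("p3",), ("p7",)),
    "t5": (("p4", "p6", "p7"), ("p8",)),
    "t6": (("p5", "p6", "p7"), ("p8",)),
}

def enabled(net, marking):
    tokens = Counter(marking)
    return {t for t, (pre, _) in net.items() if all(tokens[p] > 0 for p in pre)}

def fire(net, marking, t):
    # Only called for enabled transitions, so no count can become negative.
    pre, post = net[t]
    tokens = Counter(marking)
    tokens.subtract(pre)
    tokens.update(post)
    return tuple(sorted(tokens.elements()))

def reachable(net, initial, limit=10_000):
    seen, todo = {initial}, [initial]
    while todo:
        marking = todo.pop()
        for t in enabled(net, marking):
            nxt = fire(net, marking, t)
            if nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space exceeds limit")
                seen.add(nxt)
                todo.append(nxt)
    return seen

def risky_pairs(net):
    """Transition pairs whose presets overlap without being equal."""
    return [
        (a, b)
        for a, b in combinations(sorted(net), 2)
        if set(net[a][0]) & set(net[b][0]) and set(net[a][0]) != set(net[b][0])
    ]

def is_confusion_free(net, initial):
    pairs = risky_pairs(net)
    return not any(
        {a, b} <= enabled(net, m) for m in reachable(net, initial) for a, b in pairs
    )

print(risky_pairs(FIG4))
print(is_confusion_free(FIG4, ("p1",)))
# Hypothetical initial marking in which p4 and p5 are marked at the same time.
print(is_confusion_free(FIG4, ("p4", "p5", "p6", "p7")))
```

Starting from [p1], the places p4 and p5 are never marked together, so t5 and t6 are never enabled simultaneously and the reconstructed net passes the check even though it is not free-choice.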
Thus far, concepts such as confusion-freeness, lucency, and home clusters have not been exploited in process mining using traditional event logs. In [5], an algorithm is presented assuming translucent event logs that explicitly show the enabling of activities. However, such event logs are rarely available.

V. CONCLUSION
In this paper, we discussed recent results in free-choice net theory and related these results to Business Process Management (BPM) in general and process mining in particular. Although most discovery techniques produce free-choice models, this property is rarely exploited explicitly. Assuming that the process model is a free-choice net with a home cluster provides many valuable properties relevant for process discovery. As shown in this paper, such models are, for example, guaranteed to be lucent. This implies that there cannot be two states enabling the same set of activities. Also, disentangled paths rooted in a home cluster are safe, i.e., such paths cannot contain two tokens. The open question is how to exploit this in process mining.
We also discussed the need to add non-local dependencies. Such dependencies destroy elegant properties such as lucency. Hence, they can be seen as a secondary layer of annotations. For example, we can connect clusters that are strongly correlated. The goal is to make the process models more precise without overfitting the data or destroying the structure of the model.

Figure 4 shows a Petri net with eight places, six transitions, and twenty arcs.
WIL M.P. VAN DER AALST: USING FREE-CHOICE NETS FOR PROCESS MINING AND BUSINESS PROCESS MANAGEMENT

Fig. 6. A lucent free-choice net having two home clusters.

Fig. 7. A BPMN model that aims to describe the behavior in Figure 4 without non-local dependencies.

Fig. 8. The free-choice net corresponding to the BPMN model in Figure 7.