How can I test the relationship between events and categorical data?

I have a dataset composed mostly of categorical data: the name tags of events that happen within a process, each accompanied by a timestamp. There are hundreds of unique tags.

The process also shuts down regularly, and my data contains around 50 recorded shutdowns. Which method can I use to identify which tags are influencing the shutdowns?

Topic statistics machine-learning

Category Data Science


One option is to construct a contingency table (also called a crosstab), which shows how often each combination of categorical values occurs. Several measures of association can be computed from a contingency table, including the odds ratio, the phi coefficient, and the uncertainty coefficient.
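
As a rough illustration of the contingency-table idea, here is a minimal sketch in plain Python. It assumes the event log has already been cut into time windows, each recording which tags occurred and whether a shutdown followed; that windowing step, the toy data, and the function names are all hypothetical and not taken from the question. The 0.5 added to each cell in the odds ratio is the usual Haldane-Anscombe correction, used here only to avoid dividing by zero.

```python
from math import sqrt

# Hypothetical toy data: (tags seen in a time window, did a shutdown follow?)
windows = [
    ({"A", "B"}, True),
    ({"A"}, True),
    ({"B"}, False),
    ({"C"}, False),
    ({"A", "C"}, True),
    ({"B", "C"}, False),
]


def contingency(windows, tag):
    """Return the 2x2 counts (a, b, c, d) for one tag.

    a: tag present, shutdown      b: tag present, no shutdown
    c: tag absent,  shutdown      d: tag absent,  no shutdown
    """
    a = b = c = d = 0
    for tags, shutdown in windows:
        if tag in tags:
            if shutdown:
                a += 1
            else:
                b += 1
        else:
            if shutdown:
                c += 1
            else:
                d += 1
    return a, b, c, d


def odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio with a constant added to every cell (avoids division by zero)."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)


def phi(a, b, c, d):
    """Phi coefficient of a 2x2 table; 0.0 if any margin is empty."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


# Rank every tag by its phi coefficient with the shutdown flag.
all_tags = sorted(set().union(*(tags for tags, _ in windows)))
for tag in sorted(all_tags, key=lambda t: phi(*contingency(windows, t)), reverse=True):
    table = contingency(windows, tag)
    print(tag, table, round(odds_ratio(*table), 2), round(phi(*table), 2))
```

With only about 50 shutdowns and hundreds of tags, many tags will look associated purely by chance, so the ranking is better read as a shortlist for further inspection than as proof of influence.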


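This is a classic application of association rule analysis, which looks for items (here, tags) that frequently occur together with an outcome such as a shutdown. Note that such rules describe co-occurrence, which does not by itself establish what causes what.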
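
To make the association-rule idea concrete, the sketch below computes support, confidence, and lift for rules of the form "tag -> SHUTDOWN" in plain Python. It assumes each time window has been turned into a transaction (a set of tags, plus a `SHUTDOWN` marker when a shutdown followed); the transactions, the marker name, and the function names are hypothetical. A full analysis would normally use a frequent-itemset algorithm such as Apriori rather than scoring rules one by one.

```python
# Hypothetical transactions: tags seen in a window, plus "SHUTDOWN" if one followed.
transactions = [
    {"A", "B", "SHUTDOWN"},
    {"A", "SHUTDOWN"},
    {"B"},
    {"C"},
    {"A", "C", "SHUTDOWN"},
    {"B", "C"},
]


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent="SHUTDOWN"):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    s_ante = support(transactions, antecedent)
    s_cons = support(transactions, [consequent])
    s_both = support(transactions, set(antecedent) | {consequent})
    confidence = s_both / s_ante if s_ante else 0.0
    lift = confidence / s_cons if s_cons else 0.0
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Score a single-tag rule for every tag that appears in the data.
tags = sorted(set().union(*transactions) - {"SHUTDOWN"})
for tag in tags:
    print(tag, rule_metrics(transactions, [tag]))
```

A lift above 1 means the tag appears with shutdowns more often than independence would predict; a lift below 1 means less often.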
