
How MX Cleanses, Categorizes, and Classifies Transaction Data

Banking today is built on a foundation of digital data. The enormous old filing cabinets filled with account documents and ledgers have been replaced by data centers, and the financial industry is struggling to make the most of the shift. As Chris Skinner, author of the book Digital Bank, puts it: “We built an industry on the physical distribution of paper in a localized world, and we’re now having to get to grips with the digital distribution of data in a networked world.” For bankers, it’s not business as usual.


Given the enormous importance of this shift to digital, you should ask yourself how you can best leverage your data to stay ahead of your competition and wow your account holders.

This white paper shows one essential part of the solution. We walk you through how you can cleanse, categorize, and classify all the transaction data you send your account holders. More importantly, we show you why this matters.

Let’s get started.

First, Cleanse It

Anyone familiar with transaction feeds knows that the raw data isn’t pretty. The feeds typically consist of a messy string of random-looking characters, such as “NIXON PEABODY LLP 04ROCHESTERXXX726 XXX-XXX-1189 XXX027.”


What is an account holder supposed to do with that mess? In many cases, all they can do is dial in to your call center and ask what the feed is supposed to mean.

When users come to your site, they want data that’s insightful and understandable — with the least amount of effort required on their part. If you’re presenting uncleansed data to your users, you’re making it difficult for them, and they’re not going to truly engage with your digital solution.

Ideally, a raw transaction feed like this should be cleansed to its simplest and clearest form, which is usually just the merchant’s name (“Nixon Peabody,” for instance).


That’s it. Something as simple as that will help drive engagement and loyalty — not to mention it will save your call center employees from having to answer questions about transaction feeds all day. Offering clean data is the first step to making a digital banking experience that will delight end users.
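To make the idea concrete, here is a minimal Python sketch of rule-based cleansing. It is an illustration only, not a description of how MX cleanses data. The function name `cleanse`, the list of legal suffixes, and the two token rules (drop any token that contains a digit or a masked `XXX` run, and drop legal suffixes) are all assumptions chosen for this example.

```python
import re

# Hypothetical list of legal suffixes to strip; a real list would be longer.
LEGAL_SUFFIXES = {"LLP", "LLC", "INC", "CORP", "CO", "LTD"}


def cleanse(raw: str) -> str:
    """Reduce a raw feed string to a readable merchant name (sketch only)."""
    kept = []
    for token in raw.split():
        # Assumed rule: tokens with digits or masked runs are reference noise.
        if re.search(r"\d", token) or "XXX" in token:
            continue
        # Assumed rule: legal suffixes add nothing for the account holder.
        if token.upper().strip(".,") in LEGAL_SUFFIXES:
            continue
        kept.append(token.capitalize())
    return " ".join(kept)


print(cleanse("NIXON PEABODY LLP 04ROCHESTERXXX726 XXX-XXX-1189 XXX027"))
```

With these rules, the sample string is reduced to "Nixon Peabody". Rules this simple will misfire on merchants whose names contain digits, which is one reason a production system needs more than a handful of patterns.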

Second, Categorize It

Today’s account holders are looking for help managing their money. In fact, Javelin Research has found that your most profitable account holders want this feature the most,1 which makes sense because people who manage their money well are typically the best candidates for high-quality loans. By adding automatic categorization to your transaction feeds, you help these account holders better manage their money, improve user loyalty, and drive revenue growth.

However, not just any financial management solution will do. Unless the solution can accurately categorize the vast majority of transactions, users will find that it requires too much work and will become frustrated or give up.

Because we take accurate categorization seriously, we run each transaction through a series of four separate filters as outlined below:

1. User Preference - If a user sets a personal preference for a particular transaction, the transaction is cleansed, categorized, and classified according to that preference.

2. Parser - If a user doesn’t set a personal preference, the transaction is parsed against a high-performance data tree to determine if it fits any of our predefined rules. So, for instance, anything with “chevron” in the feed is automatically categorized as gas. This filter is generally used for national companies where it’s absolutely clear that there will be no conflict with other company names.

3. Matcher - If the transaction isn’t found by our parser, we have a set of manually created matchers that correlate with specific transaction feeds. Our in-house team of analysts reviews transactions in our system to optimize the user experience. This way even purchases at small businesses (including mom-and-pop shops) are categorized correctly. We also use the matcher for companies that might have overlapping names (“Smith’s Groceries” and “Smith’s Auto,” for instance). As a result, the matcher is incredibly precise.

4. Crowdsourced - If the transaction isn’t found in any of our other processes, it is categorized by a system that takes into consideration collective user preference.
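The order of the four filters can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not MX's implementation: the function `categorize`, the lookup tables, and every sample entry in them are hypothetical, and a plain dictionary scan stands in for the high-performance data tree.

```python
from collections import Counter

# All tables below are hypothetical sample data for illustration only.
USER_PREFERENCES = {("user-1", "chevron 0042 salt lake"): "Business Travel"}
PARSER_RULES = {"chevron": "Gas"}  # keyword -> category
MATCHERS = {  # exact feed -> category, curated by hand
    "smith's groceries": "Groceries",
    "smith's auto": "Auto",
}
CROWD_CHOICES = {"joe's diner": ["Restaurants", "Restaurants", "Entertainment"]}


def categorize(user_id, feed):
    """Return (category, tier), trying the four tiers in order."""
    key = feed.lower().strip()

    # 1. User preference: a per-user override wins over everything else.
    if (user_id, key) in USER_PREFERENCES:
        return USER_PREFERENCES[(user_id, key)], "user preference"

    # 2. Parser: keyword rules (a plain scan here, standing in for a tree).
    for keyword, category in PARSER_RULES.items():
        if keyword in key:
            return category, "parser"

    # 3. Matcher: hand-curated entries for specific feeds.
    if key in MATCHERS:
        return MATCHERS[key], "matcher"

    # 4. Crowdsourced: the category most users chose for this feed.
    choices = CROWD_CHOICES.get(key)
    if choices:
        return Counter(choices).most_common(1)[0][0], "crowdsourced"

    return "Uncategorized", "none"
```

In this sketch the same Chevron feed is categorized as "Business Travel" for the user who set that preference and as "Gas" for everyone else. Each tier is only consulted when the tiers before it have no answer, so the most specific knowledge always wins.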

To check whether these four tiers work as intended, we compared our system against two other leading data providers. We ran more than 800 real transactions, drawn from the East Coast to the West Coast, through three different aggregation services (including ours).

We tested for both cleansing and categorization accuracy. If a transaction string had been cleansed from, say, “NIXON PEABODY LLP 04ROCHESTERXXX726 XXX-XXX-1189 XXX027” to “Nixon Peabody” (or a close approximate), we marked it as correctly cleansed. If a transaction had been categorized in a way that would make sense to an end user (Costco as “Shopping,” for instance), we marked it as correctly categorized.
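Once each transaction has been marked this way, the two rates are simple percentages. The Python sketch below shows the arithmetic. The function name `accuracy_rates`, the field names, and the four sample grades are placeholders invented for illustration. They are not data from our test.

```python
def accuracy_rates(results):
    """Return (cleansed %, categorized %) for a list of graded transactions."""
    if not results:
        raise ValueError("no graded transactions")
    total = len(results)
    cleansed = sum(1 for r in results if r["cleansed_ok"])
    categorized = sum(1 for r in results if r["categorized_ok"])
    return 100 * cleansed / total, 100 * categorized / total


# Placeholder grades, NOT results from the test described in this paper.
sample = [
    {"cleansed_ok": True, "categorized_ok": True},
    {"cleansed_ok": True, "categorized_ok": False},
    {"cleansed_ok": True, "categorized_ok": True},
    {"cleansed_ok": False, "categorized_ok": True},
]

cleansed_pct, categorized_pct = accuracy_rates(sample)
print(f"{cleansed_pct:.0f}% cleansed, {categorized_pct:.0f}% categorized")
```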

What we found was that right out of the box, MX had better cleansing and categorization rates than the competition. We also found that once we ran the set of transactions through our full onboarding process (wherein we refine our algorithms to match the specific transaction set we’re given), we reached rates of 99% cleansed and 98% categorized.

Here are the details:

As you can see, we take categorization seriously. When account holders can log in and see every transaction cleansed and categorized, you know you’re ahead of the competition.

Third, Classify It

By classifying your transaction data, we help you put that data to better use. It starts with our multi-sourced account aggregation system, which lets your account holders see all their accounts — even those they have with your competitors — in your digital banking portal. From there you can see which transactions from external accounts are bill pay, ACH, POS, or direct deposit. In other words, you can know where your account holders use each of these services.
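As a rough illustration of what classification involves, the Python sketch below assigns one of the four types from the text of a transaction description. It assumes that descriptions carry recognizable markers such as "ACH" or "BILL PAY", which will not hold for every feed. The patterns, the `classify` function, and the sample descriptions are hypothetical and do not describe MX's classifier.

```python
import re

# Hypothetical patterns, checked in order; word boundaries (\b) keep
# "ACH" from matching inside a word like "COACH".
TYPE_PATTERNS = [
    ("direct deposit", re.compile(r"\b(?:DIR(?:ECT)? DEP(?:OSIT)?|PAYROLL)\b")),
    ("bill pay", re.compile(r"\bBILL ?PAY\b")),
    ("POS", re.compile(r"\bPOS\b")),
    ("ACH", re.compile(r"\bACH\b")),
]


def classify(description: str) -> str:
    """Return the first transaction type whose pattern matches (sketch only)."""
    text = description.upper()
    for label, pattern in TYPE_PATTERNS:
        if pattern.search(text):
            return label
    return "other"


for sample in ("DIR DEP ACME PAYROLL", "ONLINE BILL PAY TO CITY POWER",
               "POS PURCHASE COSTCO #112", "ACH DEBIT ACME UTILITIES"):
    print(sample, "->", classify(sample))
```

The order of the patterns matters: a payroll deposit delivered over ACH should be reported as a direct deposit, so the more specific types are checked first.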

The possible uses for this information are nearly limitless. If you notice that a large percentage of account holders pay their bills through a key competitor, you’ll know that you should run a direct campaign to persuade them to switch to you, perhaps by offering a direct reward to those who do. Or if a lot of account holders have direct deposit set up elsewhere, you can create a strategy to remedy that problem. You can use the information however you see fit.
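The first scenario comes down to a simple aggregation: for a given transaction type, what share of your account holders use each outside institution? The Python sketch below shows one way to compute it. The function `share_by_institution`, the tuple layout, and the institution names are assumptions made for this example.

```python
from collections import defaultdict


def share_by_institution(transactions, txn_type, total_holders):
    """Share of account holders using each external institution for txn_type.

    transactions: iterable of (holder_id, institution, txn_type) tuples.
    """
    holders = defaultdict(set)
    for holder_id, institution, kind in transactions:
        if kind == txn_type:
            # A set counts each holder once, however many payments they make.
            holders[institution].add(holder_id)
    return {inst: len(ids) / total_holders for inst, ids in holders.items()}


# Hypothetical external-account transactions for four account holders.
external = [
    ("a", "Competitor Bank", "bill pay"),
    ("b", "Competitor Bank", "bill pay"),
    ("a", "Competitor Bank", "bill pay"),
    ("c", "Other Credit Union", "bill pay"),
    ("d", "Competitor Bank", "POS"),
]

print(share_by_institution(external, "bill pay", total_holders=4))
```

In this made-up data, half of the four account holders pay bills through "Competitor Bank", which is the kind of signal that would justify a targeted campaign.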

Having the ability to classify data enables you to outsmart the competition. If you don’t move on implementing a feature like this before your competitors, you risk being caught flat-footed as they develop this ability and start using it to their advantage. Again, the future of banking largely hinges on who can make the best use of data.


When every transaction is cleansed, categorized, and classified, you open up the way for a far better user experience. You also set the foundation to make better use of your transaction data for both your benefit and the benefit of your account holders. Of course, this is just one small piece of the data solution. There’s a lot more you can do to make better use of data to engage account holders. If you’re interested, you can read more about Insight and Target at MX.com/products.