Proximity, fuzzy and wildcard – Getting to know Global Search
Investigators today must manage ever-growing amounts of information as part of their investigations. From bank statements, emails, social media profiles and witness statements to device downloads, addresses and registration plates, data volumes are vast, formats are manifold and data quality and consistency are often lacking.
Meanwhile, with continued rapid digitalisation and advances in interoperability between databases, systems and organisations, data sources are multiplying every year.
This wealth of information is hugely valuable, but as it continues to proliferate, it’s crucial that digital investigators can locate, prioritise and retrieve the information they seek. As is so common in investigations, that should be the case even when the information they are searching for, or with, is incomplete.
To this end, over several years, Clue has been continually developing Global Search to ensure it provides investigators with the best chance of finding evidence and building their cases.
This powerful search capability greatly enhances our users’ ability to surface insights from data by returning query results in order of relevance, in context, and with links to other entities clearly highlighted. The relationships between pieces of information are just as important as the information itself.
Getting to know Global Search
Clue’s Global Search feature is available to all users and can be used to locate and access information that has entered the system. It searches all registers, records and attachments that an individual user has access to, and searches can be filtered to exclude selected registers.
Like most search engines, if a user searches a single word term (such as Ferrari), Global Search will return hits where that word appears.
For two or more words, such as Red Ferrari, the user will see all hits where both those words appear anywhere in a record. To search for an exact phrase, the user can enclose it in quotation marks: “Red Ferrari”.
Users can also incorporate the following logic operators into any free text search:
- OR will return records containing either term.
- NOT will exclude terms.
- Brackets specify the order in which terms are interpreted. Information within brackets is read first, then information outside the brackets.
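To illustrate how these operators combine, here is a minimal Python sketch that treats each term as the set of records containing it. It is an illustration of the logic only, not Clue’s implementation: the sample records and the helper `containing` are hypothetical.

```python
# Hypothetical sample records (illustration only, not Clue data).
records = {
    1: "red ferrari seen near the harbour",
    2: "blue ferrari reported stolen",
    3: "red van parked outside",
}


def containing(term):
    """Return the ids of records that contain the term as a whole word."""
    term = term.lower()
    return {rid for rid, text in records.items() if term in text.lower().split()}


# Red Ferrari: both words must appear somewhere in the record.
both = containing("red") & containing("ferrari")

# Ferrari OR van: either term may appear.
either = containing("ferrari") | containing("van")

# Ferrari NOT red: records with the first term but without the second.
excluded = containing("ferrari") - containing("red")

# (Ferrari OR van) NOT red: the bracketed part is evaluated first.
grouped = (containing("ferrari") | containing("van")) - containing("red")

# Ferrari OR (van NOT red): moving the brackets changes the result.
regrouped = containing("ferrari") | (containing("van") - containing("red"))

print(both, either, excluded, grouped, regrouped)
```

With these sample records, the two bracketed queries return different sets, which is why the position of the brackets matters.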
However, where Global Search gets clever is in its ability to return relevant suggested results, even when presented with partial information.
With the introduction of Clue, all of The League Against Cruel Sports’ intelligence is now centralised and searchable in real time. Searching is fast and asking questions of the data is simple for investigators. Often the intelligence team don’t yet know exactly what they are looking for. Using Clue, they can iteratively layer and refine searches to uncover key information.
Read more about how The League Against Cruel Sports is using Global Search here.
Fuzzy (approximate) searches
Approximate searches, more commonly known as fuzzy searches, enable users to find words that may have been spelt incorrectly, or when the user is not sure of the spelling. Fuzzy searching can be much more powerful than exact searching, enabling investigators to locate information or individuals based on incomplete or partially inaccurate identifying information. A suspect might have provided a misspelling of their name to deliberately confuse an investigation, for example, or the correct spelling of a name may not be known.
By adding a tilde (~) and a number to the end of a search term, users can have Clue’s Global Search find words that are up to two single-character edits away from the search term, for example:
Sinclare~2 will return results that differ by up to two characters, so records containing Sinclair are still found when the name has been misspelled.
Fuzzy searches also return words that are one or two characters shorter than the word specified. For example, Jackson~2 will also return Jason and Jacks.
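The idea of "single-character edits" can be made concrete with a short Python sketch. It assumes plain Levenshtein distance (insertions, deletions and substitutions each count as one edit); the article does not say which measure Global Search uses, and the function names here are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term: str, candidate: str, max_edits: int = 2) -> bool:
    """True if candidate is within max_edits edits of term (case-insensitive)."""
    return edit_distance(term.lower(), candidate.lower()) <= max_edits


print(edit_distance("Sinclare", "Sinclair"))  # 2
print(fuzzy_match("Jackson", "Jason"))        # True  (two deletions)
print(fuzzy_match("Jackson", "Jacks"))        # True  (two deletions)
print(fuzzy_match("Jackson", "Johnson"))      # False (three substitutions)
```

Under this assumption, Sinclare and Sinclair are two edits apart, which is why the ~2 suffix is needed to match them.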
Wildcard searches
Wildcard searching provides another powerful tool for searches using incomplete information, enabling users to search using parts of words or alphanumeric strings.
This can be used in investigations where only partial information is known or has been reported, such as the first four characters of a car’s registration plate. In this case, by using an *, any text strings beginning with those characters would be returned, for example:
SP16* would return any text strings beginning with SP16 including the registration of a recently reported vehicle SP16RYO.
Meanwhile, using a ? in place of each unknown character within a text string, such as SP16R??, will return all text strings beginning with SP16R followed by exactly two characters.
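The behaviour of * and ? can be tried out with Python’s standard `fnmatch` module, which uses the same two symbols for "any run of characters" and "exactly one character". This is only an analogy for how such patterns behave, not a description of how Global Search is built; the plate list and the helper `wildcard_filter` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical registration plates (illustration only).
plates = ["SP16RYO", "SP16RY", "SP16ABC", "SP17RYO"]


def wildcard_filter(pattern, values):
    """Return the values that match the whole pattern, case-sensitively."""
    return [value for value in values if fnmatchcase(value, pattern)]


# * matches any run of characters, including none.
print(wildcard_filter("SP16*", plates))    # ['SP16RYO', 'SP16RY', 'SP16ABC']

# Each ? matches exactly one character, so SP16RY (one character short) is excluded.
print(wildcard_filter("SP16R??", plates))  # ['SP16RYO']
```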
The International Tennis Integrity Agency (ITIA) frequently obtains and imports high volumes of relevant data, including betting and social data. Clue enables efficient and focused search, revealing relevant intelligence that may otherwise have stayed buried. Clue’s advanced analysis features save the ITIA hours of work every month, with scrutiny to a level not previously possible.
Learn how the ITIA is using Global Search here.
Proximity (Near) searches
Proximity or near searches help investigators find two pieces of information that should be found close to one another, within the same field or attachment. Clue users can run proximity searches by enclosing the terms in quotation marks and adding a tilde (~) and a number after them, for example:
“John Smith” ~3 would find instances of John followed by Smith with up to three words between them:
- John Smith
- John word1 Smith
- John word1 word2 Smith
- John word1 word2 word3 Smith
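The four cases listed above can be reproduced with a small Python sketch. It assumes that the number is the maximum count of words allowed between the two terms, and that the first term must come before the second, as in the list; the article does not say whether Global Search also matches the reverse order. The function name is hypothetical.

```python
def within_proximity(text, first, second, max_between):
    """True if `second` appears after `first` with at most
    `max_between` other words in between (case-insensitive)."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    return any(
        0 < s - f <= max_between + 1
        for f in first_positions
        for s in second_positions
    )


print(within_proximity("John Smith", "John", "Smith", 3))                    # True
print(within_proximity("John word1 word2 word3 Smith", "John", "Smith", 3))  # True
print(within_proximity("John w1 w2 w3 w4 Smith", "John", "Smith", 3))        # False
```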
To learn more about how Clue’s Global Search could help to enhance and optimise your investigations and intelligence management or to book a demo, get in touch with our team today.