As TextCrimes.com is being developed part-time by student interns, our development cycle is slow, but we are actively working on the following areas.
- Within TextCrimes.com users will be able to contribute their own collections of malicious communications which they can tag using provided or user-defined tag sets.
- The process for adding these new data collections will include a quality assurance step to ensure that, where texts are digitised, the transcription accurately reflects the original image file.
- Collection owners will be able to publish their data collections to other registered users while retaining full control over who can view the open or anonymised versions of the texts.
- Collection owners will also be able to allow their data to contribute to background analyses of base-rate information without giving other users permission to read the texts at all.
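The sharing options described above could be modelled roughly as follows. This is only an illustrative sketch, not the TextCrimes.com implementation: the names `ViewLevel`, `CollectionPolicy`, `share_base_rates` and `grants` are hypothetical, and the sketch assumes that permission to view open texts also implies permission to view the anonymised versions.

```python
from dataclasses import dataclass, field
from enum import Enum


class ViewLevel(Enum):
    """Hypothetical levels of read access to the texts in a collection."""
    NONE = 0        # cannot read the texts at all
    ANONYMISED = 1  # can read anonymised versions only
    OPEN = 2        # can read the open (non-anonymised) versions


@dataclass
class CollectionPolicy:
    """Hypothetical per-collection sharing settings chosen by the owner."""
    owner: str
    # Whether the collection feeds background base-rate analyses.
    # This is independent of whether anyone else may read the texts.
    share_base_rates: bool = False
    # Maps a user name to the ViewLevel the owner has granted.
    grants: dict = field(default_factory=dict)

    def view_level(self, user):
        """Return the access level for a user; the owner always has OPEN."""
        if user == self.owner:
            return ViewLevel.OPEN
        return self.grants.get(user, ViewLevel.NONE)

    def can_read(self, user, anonymised):
        """Check whether a user may read the anonymised or open version."""
        needed = ViewLevel.ANONYMISED if anonymised else ViewLevel.OPEN
        # Assumption: OPEN access implies ANONYMISED access.
        return self.view_level(user).value >= needed.value


# Example: the data contributes to base rates, one user sees anonymised texts.
policy = CollectionPolicy(
    owner="alice",
    share_base_rates=True,
    grants={"bob": ViewLevel.ANONYMISED},
)
print(policy.can_read("bob", anonymised=True))
print(policy.can_read("carol", anonymised=True))
```

Keeping the base-rate flag separate from the per-user grants reflects the point that a collection can inform background statistics even when no other user is allowed to read its texts.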
Filters, searching and analysis
- Users will be able to create their own query sets of documents by filtering data within or across collections using the tags. For example, a query set might consist of all documents from the available collections that contain conditional threats and were written by writers known to be men.
- Once a query set has been defined, users will be able to search the texts of its documents, either for specified word strings or with regular expressions (regex). Results of these searches will initially be displayed as Key Word in Context (KWIC) lines.
- We already have the capacity to tag text strings within texts (e.g. as threats or abuse) and we are looking to develop this "highlight tags" facility further to facilitate more sophisticated searches.
- A further, more distant analysis goal is to allow users to write and share their own R scripts for running analyses on query sets.
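To make the query-set and KWIC ideas above concrete, here is a minimal sketch in Python. It is not the site's actual code: the `Document` record, the tag names (such as `conditional_threat` and `writer_gender:man`), the helper functions and the sample texts are all invented for illustration. It assumes a query set is simply the documents carrying every requested tag, and that a KWIC line is a regex match shown with a fixed number of characters of context on each side.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record; field names are illustrative only."""
    doc_id: str
    collection: str
    text: str
    tags: set = field(default_factory=set)


def build_query_set(documents, required_tags):
    """Keep only the documents that carry every tag in required_tags."""
    required = set(required_tags)
    return [doc for doc in documents if required <= doc.tags]


def kwic(documents, pattern, width=30):
    """Return (doc_id, left, match, right) for each regex hit in the texts."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    hits = []
    for doc in documents:
        for m in regex.finditer(doc.text):
            left = doc.text[max(0, m.start() - width):m.start()]
            right = doc.text[m.end():m.end() + width]
            hits.append((doc.doc_id, left, m.group(0), right))
    return hits


def format_kwic(hits, width=30):
    """Right-align the left context so the keywords line up in one column."""
    return [
        f"{doc_id}  {left:>{width}} [{match}] {right}"
        for doc_id, left, match, right in hits
    ]


# Invented sample data spanning two hypothetical collections.
docs = [
    Document("A1", "collection_a", "If you do not pay, you will regret it.",
             {"conditional_threat", "writer_gender:man"}),
    Document("A2", "collection_a", "You will regret this.",
             {"threat", "writer_gender:man"}),
    Document("B1", "collection_b", "Unless you leave, you will be sorry.",
             {"conditional_threat", "writer_gender:woman"}),
]

# Conditional threats by writers known to be men, across all collections.
query_set = build_query_set(docs, {"conditional_threat", "writer_gender:man"})

# Regex search within the query set, displayed as KWIC lines.
hits = kwic(query_set, r"\byou will\b", width=20)
for line in format_kwic(hits, width=20):
    print(line)
```

Filtering first and searching second mirrors the workflow described above: the tags narrow the pool of documents, and the regex search then runs only over that query set.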