- Created the PoC to score the severity of the crime metric using ChatGPT
- Provided the customer with a solution architecture and defined technical requirements
- Built a breakthrough solution to help attorneys optimize work processes
- Enabled automatic completion of the Inmate Load and Security Designation Form via machine learning
- Wrapped the model as an application for better usability
About the Project
Our partner (under NDA) is an expert firm specializing in criminal matters and sentence mitigation strategies. The company works side by side with federal public defenders, law firms, and private attorneys to help them navigate through and interpret the Federal Bureau of Prisons (BOP) policy.
Challenges & Project Goals:
The customer came to us to validate a product idea: using ChatGPT for inmate security designation and custody classification. The solution was expected to help attorneys optimize their work processes and significantly reduce the time spent filling in the Inmate Load and Security Designation Form. The BOP uses this form at the pre-sentence stage to classify the severity of crimes and assign prisoners to one of five custodial levels (minimum, low, medium, high, and the greatest).
Our cooperation with the customer resulted in the PoC: a machine learning model designed to make composing statements more efficient. Wrapped as an application, the model helps attorneys score the severity of the crime metric based on uploaded pre-sentence reports and in line with the Inmate Security Designation and Custody Classification policy.
Business Value Delivered:
Together with the customer, we built the PoC for the security classification and designation solution in federal criminal matters. The solution is designed to simplify processes for private attorneys and federal public defenders, who spend a lot of time filling in the Inmate Load and Security Designation Form manually. Once scaled into a full solution, the model has a strong chance of becoming a technology breakthrough with no equivalents on the legal market.
The company reached out to us with a fresh product idea to validate and long-standing experience in federal criminal justice. Lacking a technology background, the customer did not yet know whether, or how, the idea could be turned into reality.
After a series of interviews, we provided the customer with a solution architecture, as well as a roadmap on how to build the PoC for the security classification/designation solution. The Intelliarts team also helped to define specific technical requirements, estimate the effort required, and identify the phases of development.
PoC Development Stage
At the data collection stage, the customer provided us with approximately 20 documents covering the person’s identifying and security designation data and the Inmate Load and Security Designation Form, which the model had to fill in.
From the very beginning, the provided dataset posed several challenges. For example:
- Even though the document type was the same, its format could vary: the same section, such as age, could appear in different places.
- The form included both simple questions like name or date of birth and more complex ones like the severity of drug-related crime. The latter was rather subjective, and to answer this question, our data scientists had to consult a so-called offense severity scale, which was covered in detail by the Inmate Security Designation and Custody Classification policy.
- Text could not be copied from the documents, which complicated feeding the data to the model.
- The dataset was small, with only a few correct labels, so the team risked overfitting while training the model.
Our data scientists solved some of these problems at the data preparation and processing stage. For example, to work around the copying problem, we asked the attorney which parts of the documents mattered most for the PoC. The team then extracted these texts programmatically, using Python.
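To illustrate how label-based extraction can cope with sections that move around between documents, here is a minimal sketch. It assumes the text has already been recovered from the source file; the field labels, the `FIELD_PATTERNS` table, and the `extract_fields` helper are hypothetical and are not taken from the project code.

```python
import re

# Hypothetical labels: the real documents and their wording are not public.
FIELD_PATTERNS = {
    "name": re.compile(r"^\s*Name\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE),
    "date_of_birth": re.compile(
        r"^\s*(?:Date of Birth|DOB)\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE
    ),
    "age": re.compile(r"^\s*Age\s*:\s*(\d{1,3})\s*$", re.IGNORECASE | re.MULTILINE),
}


def extract_fields(text: str) -> dict:
    """Return the first value found for each known label, wherever it appears."""
    found = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        found[field] = match.group(1) if match else None
    return found


# Two layouts of the same (fictional) record: the fields appear in a different order.
layout_a = "Name: Jane Roe\nAge: 34\nDate of Birth: 01/02/1990\n"
layout_b = "OFFENSE SUMMARY\nDOB: 01/02/1990\nName: Jane Roe\nage: 34\n"
```

Because each pattern searches the whole text for its label, `extract_fields(layout_a)` and `extract_fields(layout_b)` return the same dictionary even though the sections are ordered differently.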
For the identifying data, our tech experts built a one-chunk-based system in which the chunks of data were encoded to protect the sensitive data. This system also made it easier to extract the data and feed it to the model.
At the same time, some challenges, such as the limited dataset, we simply had to accept. Fortunately, this was not critical, since we were developing the PoC only to show to investors.
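One possible reading of "encoding" sensitive chunks is to swap identifying values for placeholder tokens before the text reaches the model and to restore them afterwards. The sketch below shows that idea only; the `mask_values` and `unmask` helpers and the `[[ID_n]]` token format are assumptions made for illustration, not a description of the system the team built.

```python
import re


def mask_values(text: str, sensitive_values: list) -> tuple:
    """Replace each sensitive value with a placeholder; return masked text and mapping."""
    values = [v for v in dict.fromkeys(sensitive_values) if v]  # dedupe, drop empties
    if not values:
        return text, {}
    token_for = {v: f"[[ID_{i}]]" for i, v in enumerate(values, start=1)}
    # Longest alternatives first so a short value never matches inside a longer one.
    ordered = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(v) for v in ordered))
    masked = pattern.sub(lambda m: token_for[m.group(0)], text)
    mapping = {token: value for value, token in token_for.items()}
    return masked, mapping


def unmask(text: str, mapping: dict) -> str:
    """Put the original values back in place of the placeholders."""
    if not mapping:
        return text
    pattern = re.compile("|".join(re.escape(token) for token in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


# Fictional example record.
original = "Jane Roe was born on 01/02/1990. Jane Roe has no prior convictions."
masked, mapping = mask_values(original, ["Jane Roe", "01/02/1990"])
```

With this approach the model only ever sees the placeholders, and `unmask` restores the real values in whatever text the model returns.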
Next, our ML engineers worked with ChatGPT directly, uploading the documents and the classification form to it. Once we had found a way to extract the data, filling in the form with identifying data posed no difficulty. The process became more complicated when we moved on to security designation data. Here we had to experiment with different prompts to obtain the correct score for the severity of drug-related crimes, the only type of crime we chose for the PoC. Because this was a more abstract question, our team wrote and edited the prompts multiple times to improve the model’s performance.
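Prompt experiments of this kind can be compared systematically by running every prompt variant over the labeled reports and counting correct answers. The following is a minimal sketch under stated assumptions: `evaluate_prompts` is a hypothetical helper, and `ask_model` stands for any function that sends a prompt to the model and returns its text answer (no real API call is shown).

```python
from typing import Callable


def evaluate_prompts(
    prompts: dict,
    labeled_reports: list,
    ask_model: Callable[[str], str],
) -> dict:
    """Return, for each prompt variant, the share of reports scored correctly."""
    results = {}
    for name, template in prompts.items():
        correct = 0
        for report_text, expected in labeled_reports:
            # Each template is assumed to contain a single {report} placeholder.
            answer = ask_model(template.format(report=report_text))
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
        results[name] = correct / len(labeled_reports) if labeled_reports else 0.0
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for the real model, used only to make the sketch runnable."""
    return "greatest" if "severity scale" in prompt.lower() else "I am not sure"


prompt_variants = {
    "v1": "Score the offense: {report}",
    "v2": "Using the offense severity scale, answer with one word. Report: {report}",
}
labeled = [("report A", "Greatest"), ("report B", "Greatest")]
scores = evaluate_prompts(prompt_variants, labeled, fake_model)
```

Keeping the model call behind a plain function makes it easy to rerun the same comparison after each prompt edit.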
This step also involved close partnership with attorneys. After each iteration, we met with the customer to go through the model results and clarify the details that required domain knowledge. For example, information about large-scale drug activity affected the score. The expert explained the abstract concept of “large” as “being a drug leader, with 4-5 people under supervision”. Overall, tight cooperation with the customer was a defining feature of this project, since we needed specific knowledge of US criminal justice matters.
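Expert clarifications like this one can be passed to the model as explicit definitions inside the prompt. The sketch below is illustrative only: `build_prompt` and the prompt wording are assumptions, while the definition of "large-scale" is the one quoted above.

```python
def build_prompt(report_text: str, definitions: dict) -> str:
    """Assemble a scoring prompt that spells out expert-provided definitions."""
    lines = [
        "Score the severity of the drug-related offense described in the report.",
        "Apply these definitions supplied by the domain expert:",
    ]
    for term, meaning in definitions.items():
        lines.append(f'- "{term}" means: {meaning}')
    lines.append("Report:")
    lines.append(report_text)
    return "\n".join(lines)


expert_definitions = {
    "large-scale drug activity": "being a drug leader, with 4-5 people under supervision",
}
prompt = build_prompt("(pre-sentence report text goes here)", expert_definitions)
```

Keeping the definitions in a dictionary means that each new clarification from the attorneys becomes one more entry rather than a rewrite of the prompt.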
After empirically validating the model results, our ML engineers deployed the model and wrapped it as an application for better usability. Now the customer only submits the document with the inmate’s personal information and receives the form filled in with identifying data and security designation data (for now, the severity of drug-related crimes only).
The project is currently on hold while the customer shows the application to investors. Using ChatGPT, the Intelliarts team created the PoC for the security classification/designation solution and achieved the following business outcomes:
- The solution automates and optimizes the daily operations of attorneys, who spend many working hours filling in the Inmate Load and Security Designation Form manually. It reduces the time spent on research, so lawyers, as the end users, can focus on the trial instead of repetitive tasks.
- Since the legal market gravitates towards tradition and conservatism, the solution is likely to be a novelty with few or no rivals to compete with.
- Currently, the PoC model provides 100% accurate results on the documents available to us, and the Intelliarts team is determined to reach the same accuracy when we move to building the full-scale security classification/designation solution.
- For now, the system covers drug-related offenses only, but we designed it to be easily extendable.