In this article, I will introduce a development process that can make an incredible difference in individual and team productivity. By focusing on this concept, my team and I have greatly increased our speed of delivery, and we are learning faster. It’s an important mindset and focus that, although quite simple and basic, is often overlooked or deprioritized. I hope you will find it as useful as I have.
Below you can see the two development processes that will be introduced in more detail later in the article:
Some key takeaways from this article are:
- Inner Loop: Focuses on fast development and testing.
- Outer Loop: Automation for comprehensive integration testing, user acceptance testing, deployment, and continuous monitoring.
- AI Inner Loops: Plural on purpose. For AI projects it is important to have many inner loops for different purposes.
- AI Outer Loops: For AI projects, the outer loop is concerned with the entire pipeline from data collection to model deployment.
- Moving from Inner to Outer: Start from within. If the inner loop fails, the outer loop will also fail.
When developing AI projects, a well-suited and well-defined development process can greatly increase the speed of development and the quality of the product. The traditional software development loop is a well-known concept and is often described as having an inner and outer loop.
Small habits and practices are what differentiate the good developers from the great ones. It requires effort and time to change habits, but it is worth it.
We will start by looking at the traditional software development loop, and then move on to what is different in AI projects and how to tweak the development loops to suit them better.
The Traditional Software Development Loop
Let’s start by looking at the traditional software development loop. Data scientists and AI engineers are just a special kind of software developer, and if you are building apps that run in production, both the traditional software development loop and the AI development loop are important to understand and master.
The traditional software development loop consists of two main parts: the inner loop and the outer loop. The inner loop focuses on the development and testing of the code, while the outer loop focuses on the deployment and monitoring of the application.
It is important to start from the inner loop(s) and move out to the outer loop. If the inner loop fails, the outer loop will also fail.
Inner Loop
The inner loop typically consists of the following steps:
- Code: Writing the actual code for the application.
- Test: Running tests to ensure the code works as expected.
- Debug: Identifying and fixing any issues or bugs in the code.
- Refactor: Improving the code structure without changing its functionality.
- Repeat: Repeating the cycle to continuously improve the code until it meets the required standards.
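To make the steps above concrete, here is a minimal sketch of one turn of the inner loop. The function names and the word-counting task are my own illustration, not part of any specific project: a first version is written to pass the test, then refactored, and the same check confirms that the behaviour did not change.

```python
from collections import Counter


def word_count_v1(text):
    """Code step: a first working version with an explicit loop."""
    counts = {}
    for word in text.lower().split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def word_count_v2(text):
    """Refactor step: same behaviour, simpler structure."""
    return dict(Counter(text.lower().split()))


def check(word_count):
    """Test step: one check that is reused for every version of the function."""
    assert word_count("The cat and the hat") == {"the": 2, "cat": 1, "and": 1, "hat": 1}
    assert word_count("") == {}


# Run the same test against both versions; a failure here sends us to the debug step.
check(word_count_v1)
check(word_count_v2)
```

Because the check runs in milliseconds, it can be repeated after every small change, which is what keeps the loop fast.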
Tweaking the inner loops requires much skill, time, and effort, but it is an investment that will quickly pay off and make a huge difference in the quality and speed of the development.
This is even more important in AI projects, as we will see later in this article. The inner loop is where you spend most of your time, and it should be a good experience to work in this loop.
Outer Loop
Outer loops can be implemented in many different ways, but the following is the simplest version of the outer loop that I have found to work well. It can be expanded with more steps for improved quality assurance.
The outer loop includes the following steps:
- Automated QA Pipeline: Running automated quality assurance tests to ensure the code meets the required standards. This can include static code analysis, code formatting checks, unit tests, and more.
- Deploy @Sand: Deploying the code to a sandbox environment for further testing.
- Automated System Testing: Running automated tests (for instance smoke tests) to check the basic functionality of the application.
- PR Code Review: Conducting a peer review of the code changes through pull requests.
- Deploy @Test: Deploying the code to a test environment for more comprehensive testing.
- Manual End2End Testing: Performing manual end-to-end testing to ensure the application works as expected from start to finish.
- Deploy @Prod: Deploying the code to the production environment.
- Monitor: Monitoring the application in production to ensure it runs smoothly and to identify any issues.
- Repeat: We move back to the inner loop if any issues are found, and repeat the cycle.
This loop ensures that the software is developed, tested, and deployed in a systematic and efficient manner, leading to higher quality and more reliable applications. Because much of the process is automated, you can deploy often and feel confident that the code works as expected.
Tweaking the development loops for AI projects
AI systems are a bit different from traditional software. Their behaviour depends on data and trained models as well as on code, so they require a different approach to development.
Inner loops for AI projects
In AI projects, I prefer to have many inner loops of different cycle times. Examples of inner loops are:
- Data preprocessing loop including data cleaning, feature engineering, and data augmentation. For faster development work on a small subset of data to confirm the code works, then scale it up to the full dataset.
- Coding loop including linting, type checking, and unit testing
- Model behavioural testing loop. It can be a good idea to have local scripts that test certain aspects of the model. For an LLM RAG system, it could be how good is the retrieval, how good is table reading, how good is it at rewriting questions based on the chat history, etc.
- Model evaluation loop measuring accuracy, precision, recall, F1 score, etc. This is a longer loop that gives the overall performance and should be run on the cloud where the results are stored to ensure lineage, reproducibility, and have a history of the model performance. This will make it easier for team members to understand what has been done and why.
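As an illustration of the model evaluation loop, the sketch below computes the metrics mentioned above for a binary classifier using only the standard library. The function name and the example labels are assumptions made for this example; in a real project you would likely use an established metrics library and store the result together with the model and dataset version.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every division so an empty or one-sided dataset does not crash the loop.
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```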
There should be a focus on optimizing the inner loops to cover important aspects of the AI system and provide quick feedback.
Keeping the loops small is especially important as the size of the project increases.
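One way to keep the data preprocessing loop small is to run it on a reproducible sample first. The sketch below is a minimal example under my own assumptions: the `preprocess` step, the row format, and the synthetic dataset are all hypothetical stand-ins for your real data.

```python
import random


def preprocess(rows):
    """Hypothetical cleaning step: drop rows without text, normalize case and whitespace."""
    cleaned = []
    for row in rows:
        text = row.get("text")
        if not text or not text.strip():
            continue
        cleaned.append({**row, "text": " ".join(text.lower().split())})
    return cleaned


def sample_rows(rows, n, seed=0):
    """Return a reproducible random subset of at most n rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


# Synthetic stand-in for a large dataset; every tenth row has missing text.
full_dataset = [
    {"id": i, "text": f"  Example   TEXT {i} " if i % 10 else None}
    for i in range(10_000)
]

# Fast loop: confirm the code works on a small subset.
subset_result = preprocess(sample_rows(full_dataset, n=100))
assert all(r["text"] == " ".join(r["text"].lower().split()) for r in subset_result)

# Slow loop: only once the fast loop passes, scale up to the full dataset.
full_result = preprocess(full_dataset)
print(len(subset_result), len(full_result))
```

The fixed seed means a failure on the subset can be reproduced exactly on the next run.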
It is important to have many different loops in a project to test different levels and aspects of the AI system. For a development task, you should start with the fastest loops and move upwards to the slower loops.
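The idea of starting with the fastest loops can be sketched as a small runner that executes checks in order and stops at the first failure. This is only an illustration: the loop names and the placeholder lambdas are hypothetical, and in a real project each callable might wrap a linter, a test runner, or an evaluation script.

```python
import time


def run_loops(loops):
    """Run checks in the given order and stop at the first failure.

    `loops` is a list of (name, callable) pairs ordered from fastest to slowest.
    Each callable returns a truthy value on success.
    """
    results = []
    for name, check in loops:
        start = time.perf_counter()
        ok = bool(check())
        elapsed = time.perf_counter() - start
        results.append((name, ok, elapsed))
        if not ok:
            # No point in paying for the slower loops when a fast one already fails.
            break
    return results


# Placeholder checks, ordered from fastest to slowest.
loops = [
    ("lint", lambda: True),
    ("unit tests", lambda: True),
    ("behavioural tests", lambda: False),  # pretend this loop fails
    ("full evaluation", lambda: True),     # never reached in this run
]

for name, ok, elapsed in run_loops(loops):
    print(f"{name}: {'ok' if ok else 'FAILED'} ({elapsed:.3f}s)")
```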
Ways to have really fast inner loops:
- Use debugging tools, like pdb or a breakpoint in your IDE, then use the interactive debug console to write new code, or copy the failing code into the console to iterate fast on different solutions. Use dir() and help() on objects to understand them better, or the inspect module for more advanced introspection.
- Split your code into small functions and test them individually. Be really good at writing unit tests, mocking out dependencies, managing test data and using fixtures.
- Use test-driven development (TDD) principles to get quick feedback on your code
- Become good at using generative AI tools built into the IDE (for instance GitHub Copilot or the Cursor IDE)
- Use a Jupyter notebook to avoid the overhead of running the whole script (not recommended)
- Develop a small script that can be run from the command line to test or validate a certain behaviour of the code. But make sure to use unit and integration tests where suitable.
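As an example of mocking out dependencies, the sketch below tests one small step of a RAG-style system without calling a real retriever or language model. The `answer_question` function and the `search` and `complete` methods are hypothetical interfaces invented for this illustration; only `unittest.mock.Mock` comes from the standard library.

```python
from unittest.mock import Mock


def answer_question(question, retriever, llm):
    """Hypothetical RAG step: fetch context, then ask the model."""
    documents = retriever.search(question, top_k=3)
    context = "\n".join(documents)
    return llm.complete(f"Context:\n{context}\n\nQuestion: {question}")


def test_answer_question_passes_retrieved_context_to_llm():
    # Replace the slow, external dependencies with mocks that return fixed values.
    retriever = Mock()
    retriever.search.return_value = ["Doc A", "Doc B"]
    llm = Mock()
    llm.complete.return_value = "42"

    result = answer_question("What is the answer?", retriever, llm)

    assert result == "42"
    retriever.search.assert_called_once_with("What is the answer?", top_k=3)
    # Inspect the prompt that was sent to the (mocked) model.
    prompt = llm.complete.call_args.args[0]
    assert "Doc A\nDoc B" in prompt
    assert prompt.endswith("Question: What is the answer?")


# A test runner such as pytest would discover this automatically; here it is called directly.
test_answer_question_passes_retrieved_context_to_llm()
```

Since nothing leaves the process, a test like this runs in milliseconds and fits in the fastest inner loop.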
Principle: Select the smallest loop that can give you the information you need or test the code you are working on.
For testing, if code fails at the quick loops, it will also fail at the slow inner loops and fail in the outer loop.
Again, the best is to be good at all of these disciplines and use the right tool for the right job.
I have developed a Python package called snappylapy, a tool that can help to easily increase the speed of the inner loops. It makes it really easy to manage data in unit tests when splitting a script into separate steps that can be tested individually. It solves the dilemma of needing tests to be independent, but also test the integration of the different steps, and keeping test data up-to-date. It handles snapshot testing, fuzzy expectations and much more. Read more about it here.
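To show the general idea of snapshot testing, here is a hand-rolled sketch that uses only the standard library. It deliberately does not use snappylapy and does not reflect its API; the helper `assert_matches_snapshot` and the step `extract_features` are hypothetical names made up for this example.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(value, snapshot_path, update=False):
    """Compare a JSON-serializable value with a stored snapshot.

    On the first run, or when update=True, the snapshot is written instead of compared.
    """
    snapshot_path = Path(snapshot_path)
    serialized = json.dumps(value, indent=2, sort_keys=True)
    if update or not snapshot_path.exists():
        snapshot_path.parent.mkdir(parents=True, exist_ok=True)
        snapshot_path.write_text(serialized, encoding="utf-8")
        return
    expected = snapshot_path.read_text(encoding="utf-8")
    assert serialized == expected, f"Output differs from snapshot {snapshot_path}"


def extract_features(text):
    """Hypothetical pipeline step under test."""
    words = text.split()
    return {"n_words": len(words), "longest": max(words, key=len) if words else ""}


# A temporary directory keeps the example self-contained; a real project would
# keep snapshots under version control next to the tests.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "snapshots" / "extract_features.json"
    assert_matches_snapshot(extract_features("fast inner loops"), snapshot)  # first run: writes
    assert_matches_snapshot(extract_features("fast inner loops"), snapshot)  # second run: compares
```

The stored output of one step can also serve as test input for the next step, which is one way to keep tests independent while still checking that the steps fit together.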
I encourage you to think about how to tweak the inner loops to be faster. It takes practice and requires changes in development habits, but trust me - it is worth it.
The AI outer loop for ML Model Operations
The outer loop for AI projects is also a bit different from traditional software development.
For the MLOps model there are three categories of steps: the pipeline steps, the data steps and the ML steps. The pipeline steps are concerned with deploying a pipeline that can be used for training and evaluating the AI model. The pipeline can be configured for continuous training, where new models are trained as new data comes in. The data steps are concerned with gathering raw data from various sources that will be used for training and evaluating the AI model. The ML steps are for training the AI model, adjusting its parameters to improve its performance.
The outer loop can consist of the following steps:
- Pipeline Validation: Ensuring that the entire pipeline, from data collection to model deployment, works as expected and meets the required standards.
- Pipeline Deployment: Deploying the validated pipeline to a production environment where it can be used for real-world data processing and model training.
- Data Collection: Gathering raw data from various sources that will be used for training and evaluating the AI model.
- Data Curation: Cleaning and organizing the collected data to ensure it is of high quality and suitable for training the model.
- Data Transformation: Converting the curated data into a format that can be used by the AI model, which may include normalization, encoding, and feature extraction.
- Data Validation: Verifying that the transformed data is accurate, complete, and suitable for training the model.
- Model Training: Using the validated data to train the AI model, adjusting its parameters to improve its performance.
- Model Evaluation: Assessing the trained model’s performance using various metrics such as accuracy, precision, recall, and F1 score to ensure it meets the desired criteria. This evaluation is also run in an inner loop, but here we want to store a versioned benchmark, preferably on a different dataset than the one used in the inner loop, to ensure the model generalizes well.
- Model Deployment: Deploying the trained and evaluated model to a production environment where it can be used to make predictions on new data.
- Model Monitoring: Continuously monitoring the deployed model’s performance in the production environment to ensure it remains accurate and reliable.
- Model Feedback: Collecting feedback from the model’s performance in the real world and using it to retrain or improve the AI system, ensuring it adapts to new data and scenarios.
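The data and ML steps above can be sketched as a chain of small functions with a quality gate before deployment. Everything in this example is a toy assumption: the in-memory data, the threshold "model", and the `min_accuracy` gate stand in for real data sources, training code, and deployment tooling.

```python
def collect():
    """Hypothetical data source: (feature, label) pairs, one with a missing feature."""
    return [(0.1, 0), (0.4, 0), (0.35, 0), (0.8, 1), (0.9, 1), (None, 1), (0.7, 1)]


def collect_holdout():
    """Hypothetical held-out data, kept separate from the training data."""
    return [(0.2, 0), (0.3, 0), (0.6, 1), (0.95, 1)]


def curate(rows):
    """Data curation: drop rows with missing features."""
    return [(x, y) for x, y in rows if x is not None]


def validate(rows):
    """Data validation: fail early instead of training on bad data."""
    if not rows:
        raise ValueError("no data left after curation")
    if any(not 0.0 <= x <= 1.0 or y not in (0, 1) for x, y in rows):
        raise ValueError("feature or label out of range")
    if {y for _, y in rows} != {0, 1}:
        raise ValueError("both classes must be present")
    return rows


def train(rows):
    """Toy model: a threshold halfway between the two class means."""
    neg = [x for x, y in rows if y == 0]
    pos = [x for x, y in rows if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2


def evaluate(threshold, rows):
    """Accuracy of the threshold model on the given rows."""
    correct = sum(1 for x, y in rows if (x >= threshold) == bool(y))
    return correct / len(rows)


def run_pipeline(min_accuracy=0.9):
    """Collect, curate, validate, train, evaluate, then gate deployment on accuracy."""
    train_rows = validate(curate(collect()))
    holdout_rows = validate(curate(collect_holdout()))
    threshold = train(train_rows)
    accuracy = evaluate(threshold, holdout_rows)
    return {"threshold": threshold, "accuracy": accuracy, "deployed": accuracy >= min_accuracy}


print(run_pipeline())
```

Monitoring and feedback would close the loop by feeding new production data back into `collect`; that part is left out to keep the sketch short.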
The outer loop ensures that the AI system is developed, tested, and deployed in a systematic and efficient manner, leading to higher quality and more reliable applications.
In summary, understanding and optimizing both the inner and outer development loops is crucial for the success of AI projects. By focusing on fast, iterative inner loops and robust, automated outer loops, you can significantly enhance the quality and speed of your development process. Remember, the key is to start small, iterate quickly, and continuously improve. Whether you’re working on traditional software or cutting-edge AI systems, these principles will help you deliver better results more efficiently. Happy coding!