We’re hard at work on Project Cyborg, our DevOps bot designed to enhance our team to provide 10x the DevOps services per person. Building a bot like this takes a lot of pieces working in concert. To that end, we need a step in our chain to classify requests: does a query need to go to our Containerization model or our Security model? The solution: classification. A model that can figure out what kind of prompt it has been given. Then, we can pass along the prompt to the correct model. To test out the options on OpenAI for classification, I trained a model to determine if news articles would be relevant to our business or not.
I started by pulling down the articles from Google News.
This way, I can pull down a list of Google News articles with a certain search term within a date range. Google News does not do a good job of staying on topic by itself.
So, once I have this list of articles with full-text and summaries, I loaded them into a dataframe using Pandas and output that to an Excel sheet.
for i in range(2,20):
# Grab the page info for each article
# Create a Pandas Dataframe to store the article
Fine-Tuning for Classification
Then comes the human effort. I need to teach the bot what articles I consider relevant to our business. So, I took the Excel sheet and added another column, Relevancy.
I then manually ran down a lot of articles, looked at titles, summaries and sometimes the full text, and marked them as relevant or irrelevant.
Then, I took the information I have for each article, title, summary, and full-text, and combined them into one column. They form the prompt for the fine tuning. Then, the completion is taken from the relevancy column. I put these two columns into a csv file. This will be the training set for our fine-tuned model.
I got out our training dataset and our validation dataset. With that in hand, it was time to train a model. I selected Ada, the least-advanced GPT-3 model available. It’s not close to ChatGPT, but it is good for simple things like classification. A few cents and half an hour later, I have a fine-tuned model.
I can now integrate the fine-tuned model into my Google News scraping app. Now, it can pull down articles from a search term, and automatically determine if they are relevant or not. The relevant ones go into a spreadsheet to be viewed later. The app dynamically builds prompts that match the training data, and so I end up with a spreadsheet with only relevant articles.
We’ve been trying a lot of different things in Project Cyborg, our quest to create the DevOps bot. The technology around AI is complicated and evolving quickly. Once you move away from Chat Bots and start making more complicated things, like working with embeddings and agents, you have to hold a lot of information in… Read more: Visual Prompting: LLMs vs. Image Generation
Working with LLMs is complicated. For simple setups, like general purpose chatbots (ChatGPT), or classification, you have few moving pieces. But when it’s time to get serious work done, you have to coax your model into doing a lot more. We’re working on Project Cyborg, a DevOps bot that can identify security flaws, identify cost-savings… Read more: How to take the brain out of the box: AI Agents
AI embeddings are powerful. We’re working on Project Cyborg, a project to create a DevOps bot. There’s a lot of steps to get there. Our bot should be able to analyze real-world systems and find our where we could implement best practices. It should be able to look at security systems and cloud deployments to… Read more: What does AI Embedding have to do with Devops?
AI (Artificial Intelligence) is a rapidly advancing technology that has the potential to revolutionize a wide range of industries, from healthcare to finance to manufacturing. While some people may view AI as a toy or a gimmick, it is actually a foundational technology that is already transforming the world in significant ways. AI is foundational… Read more: Take AI Seriously: It is Foundational
Classification We’re hard at work on Project Cyborg, our DevOps bot designed to enhance our team to provide 10x the DevOps services per person. Building a bot like this takes a lot of pieces working in concert. To that end, we need a step in our chain to classify requests: does a query need to… Read more: Using Classification to Create an AI Bot to Scrape the News