Journey in Rust: Bending ChatGPT to our will, Custom Format response generation.

Getting the exact response you want from ChatGPT.

Hey there, fellow Rustaceans! Welcome back to our thrilling journey of learning Rust and creating an open-source tool. In today's episode, we have a lot to cover, and we'll discover a whole new way to tackle GPT. So buckle up, and let's get started! If you just want to learn about prompt tuning, skip ahead to the "Change of plans" section.

Environment Variables to the Rescue!

Remember the last time we played around with tokens and commands? Well, things are about to get even more exciting. We'll now be using environment variables to store some of these values. This way, we can set them externally and make our code more flexible. Here's how we did it:

if let Some(commands) = manual_commands {
    env::set_var("commands", commands);
}
if auto_commands {
    let commands = get_installed_commands();
    env::set_var("commands", commands);
}

Now, if manual_commands is not None, we store it in the commands environment variable. And if auto_commands is true, we first fetch the installed commands using get_installed_commands(), and then store that in the commands environment variable as well.

Note: This is just a starting point - we're still working on the actual implementation of the get_installed_commands() function, which currently returns an empty string. Hang tight, though, as we'll flesh it out in later episodes!

Functions Galore: Getting Environment Variables and Defaults

We've added a few new functions to handle environment variables and provide default values. Let's check them out!

get_installed_commands()

This function currently returns an empty string, but we'll be implementing it soon enough. For now, we've added it to our code to set up the structure for when we're ready to dive into the world of installed commands.

fn get_installed_commands() -> String {
    "".to_owned()
}

get_commands()

This function fetches the value of the commands environment variable. If it's not set, it returns an empty string.

fn get_commands() -> String {
    env::var("commands").unwrap_or_default()
}

get_default_tokens()

This function fetches the value of the tokens environment variable. If it's not set, it falls back to a default value of 200. If it is set but can't be parsed as a number, the expect call panics with the message "tokens should be a number".

fn get_default_tokens() -> u32 {
    env::var("tokens")
        .unwrap_or("200".to_owned())
        .parse()
        .expect("tokens should be a number")
}
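If you would rather fall back to the default when the value can't be parsed, instead of panicking, one option is to separate the parsing from the environment lookup. This is only a minimal sketch and not the code in the repository: DEFAULT_TOKENS, parse_tokens_or_default, and get_default_tokens_lenient are hypothetical names made up for this example.

```rust
use std::env;

// Hypothetical constant: the fallback value used in this post.
const DEFAULT_TOKENS: u32 = 200;

// Returns the parsed value, or DEFAULT_TOKENS when the input is missing
// or is not a valid u32.
fn parse_tokens_or_default(raw: Option<&str>) -> u32 {
    raw.and_then(|value| value.trim().parse::<u32>().ok())
        .unwrap_or(DEFAULT_TOKENS)
}

// Hypothetical lenient variant of get_default_tokens().
fn get_default_tokens_lenient() -> u32 {
    parse_tokens_or_default(env::var("tokens").ok().as_deref())
}

fn main() {
    println!("default tokens: {}", get_default_tokens_lenient());
}
```

Keeping the parsing in its own function means it can be tested without touching the real environment.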

And that's a wrap! Today, we ventured into the realm of environment variables and added new functions to our code. As always, stay tuned for more fun and learning in our next episode. Until then, happy coding! 🚀


First off, we've got some new code to deal with in our main function:

It handles the case where the user wants to configure a default value for the number of tokens.

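if let Some(tokens) = tokens {
    OpenOptions::new()
        .create(true) // create .env if it does not exist yet
        .append(true)
        .open(".env")?
        // the trailing newline keeps repeated entries on separate lines
        .write_all(format!("TOKENS={}\n", tokens).as_bytes())?;
}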

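What's happening here? Well, we're checking if the tokens variable has a value (i.e., it's not None). If it does, we append a TOKENS=<value> line to the .env file. Once that file is loaded into the environment, we can access the value throughout our program without needing to pass it around as a parameter. One thing to watch: environment variable names are case-sensitive on Unix-like systems, so the name written to .env has to match the name that get_default_tokens() reads.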

Remember how we talked about environment variables being like a backpack you can carry around and access whenever you need something? Well, this is us putting the token's value into our backpack! 🎒
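The post doesn't show how the .env file gets read back, so here is a rough sketch of the round trip using only the standard library. The helper names append_env_line and read_env_value are hypothetical, and a real project may well rely on a dotenv-style crate instead. Because we append, the sketch lets the last matching line win.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

// Appends `KEY=value` on its own line, creating the file if it is missing.
fn append_env_line(path: &Path, key: &str, value: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{key}={value}")
}

// Returns the value of the last `KEY=...` line, so newer entries win.
fn read_env_value(path: &Path, key: &str) -> io::Result<Option<String>> {
    let contents = fs::read_to_string(path)?;
    let value = contents
        .lines()
        .filter_map(|line| line.split_once('='))
        .filter(|(name, _)| name.trim() == key)
        .map(|(_, value)| value.trim().to_owned())
        .last();
    Ok(value)
}

fn main() -> io::Result<()> {
    // A throwaway file in the temp directory, not the project's real .env.
    let path = std::env::temp_dir().join("termoil_example.env");
    let _ = fs::remove_file(&path); // start from a clean file
    append_env_line(&path, "TOKENS", "200")?;
    append_env_line(&path, "TOKENS", "350")?;
    println!("{:?}", read_env_value(&path, "TOKENS")?); // prints Some("350")
    fs::remove_file(&path)
}
```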

Step 2: Using Default Values

Next up, we've made a tiny change to the Commands::Search block:

// unwrap_or_else only calls get_default_tokens() when no value was passed in
let tokens = tokens.unwrap_or_else(get_default_tokens);

Previously, we had a hardcoded default value of 200 for tokens. Now, we've switched it up and are using a function called get_default_tokens() to fetch the default value. This makes the code more flexible and easier to maintain since we can change the default value in one place (the get_default_tokens() function) instead of searching for all instances of the hardcoded value.

Think of it like having a favorite ice cream flavor 🍦. Instead of telling everyone, "I love chocolate!" each time you talk about ice cream, you can just say, "I love my favorite flavor!" and have a function that returns "chocolate" as your favorite flavor. Easy peasy!


Change of plans

Earlier, I had decided that, for more context, I would collect all the commands installed on the user's machine and use that list as the default value for the commands variable. I have since changed my mind.

Why?
Well, as it turned out, my system has 4000+ externally installed commands (as listed by compgen), which adds up to around 6000+ tokens. That is expensive and beyond GPT-3.5's limit. So here's what I am going to do instead.
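To get a feel for that budget before sending anything, a back-of-the-envelope check can help. The sketch below is not how the numbers above were measured. It assumes roughly 4 characters per token (a common rule of thumb, real tokenizers will differ) and the 4,096-token context window of the original gpt-3.5-turbo. The names estimate_tokens and fits_in_context are hypothetical.

```rust
// Assumption: roughly 4 characters per token, a common rule of thumb for
// English text. Real tokenizers will give different counts.
const CHARS_PER_TOKEN: usize = 4;
// Assumption: the 4,096-token context window of the original gpt-3.5-turbo.
const CONTEXT_LIMIT: usize = 4096;

// Rounds up, so any non-empty text counts as at least one token.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN
}

// Checks whether the prompt plus the tokens reserved for the reply fit.
fn fits_in_context(prompt: &str, reserved_for_reply: usize) -> bool {
    estimate_tokens(prompt) + reserved_for_reply <= CONTEXT_LIMIT
}

fn main() {
    // Hypothetical stand-in for a long list of installed command names.
    let commands: Vec<String> = (0..4000).map(|i| format!("cmd{i}")).collect();
    let prompt = commands.join(" ");
    println!("estimated tokens: {}", estimate_tokens(&prompt));
    println!("fits in context: {}", fits_in_context(&prompt, 200));
}
```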

NEW Plan:

What I can do instead is give it a prompt that allows it to use external commands and ask for a separate list of all the external commands needed for the task. Now this could have been it, but I have decided to take it a step further: we can check whether each command exists on the user's machine and only ask the user to install it if it's not there.

What do we need now?

  1. Instruction Commands

  2. External Commands

  3. Description of the task

If a command is not installed, we will also need a command to install it. We can ask GPT to generate that for us as well, which adds a fourth item:

  4. Install Command

Here is what I want the explanation to look like:

| Part | Description |
| --- | --- |
| Command | ffmpeg -i input_video.mp4 -vn -acodec copy output_audio.m4a |
| ffmpeg | A command-line tool used for handling audio, video, and other multimedia files. |
| -i input_video.mp4 | Specifies the input video file. |
| -acodec copy | Copies the audio stream from the input file to the output file without any re-encoding. |
| output_audio.m4a | Specifies the output audio file. The format of the output audio file is determined by its extension, which in this case is .m4a. |

Plan of action:

New prompt:

The prompt we have had until now is: "Act as a terminal expert, the answer should be the COMMAND ONLY, no need to explain. OS: {OS}"

What do we want? Maybe a reply in JSON format that gives us everything we need, so that we can easily achieve what we want. But how???

Well, as it turns out, you can assign ChatGPT a role and ask it to act like something. So here is what I am going to do: I will ask it to act as an API server and also provide an example response and format. Here is what we have got:

Act as a smart terminal assistant API server, provide   help with general tasks you are allowed to use external dependencies, here is the example output format, output should be in JSON:\n
    {
    "instruction_commands": ["ffmpeg -i input_video.mp4 -vn -acodec copy output_audio.m4a"],
    "external_commands": ["ffmpeg"],
    "external_install": ["sudo apt install ffmpeg"],
    "explanation":" 
    | Part | Description | 
    | --- | --- | 
    | Command | ffmpeg -i input_video.mp4 -vn -acodec copy output_audio.m4a | 
    | ffmpeg | A command-line tool used for handling audio, video, and other multimedia files. | 
    | -i input_video.mp4 | Specifies the input video file. | 
    | -vn | Disables the video stream from the output. | 
    | -acodec copy | Copies the audio stream from the input file to the output file without any re-encoding. |
    | output_audio.m4a | Specifies the output audio file. The format of the output audio file is determined by its extension, which in this case is .m4a. |"
    }

    The output should be a JSON object with the following fields:
    instruction_commands: A list of commands that can be run to complete the task.
    external_commands: A list of commands that are not built-in to the terminal, but are required to complete the task.
    external_install: A list of commands that can be run to install the external commands.
    explanation: A markdown table that explains the commands and their arguments.

    Here is your first task:

This is not deterministic, but we can work with it and get good results.


Step 4: Adding Serialization and Updating the Query

In this step, we are going to add Serialize to the Instructions struct and make some changes to the query. Let's start by adding the Serialize trait to the derive list of the Instructions struct:

#[derive(Debug, Deserialize, Serialize)]
struct Instructions {
    instruction_commands: Vec<String>,
    external_commands: Vec<String>,
    external_install: Vec<String>,
    explanation: String,
}

By doing this, we can now serialize the Instructions struct, which will be useful later on when we want to display the data in a more user-friendly format.

Next, we are going to update the query in the Commands::Search arm. Before, the query string was hard coded to include the system message and the user's query. Now, we're going to make it more dynamic and include a JSON message format:

query = format!(
    "{system}{query} using terminal OS: {OS}",
    system = get_system_message(),
    query = query,
    OS = get_os()
);

This updated query format will make it easier for us to extract and display the relevant information to the user later on. The get_system_message function has also been updated with a more detailed message format.
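To see what the final string looks like, here is a small self-contained sketch of the same format! call. The bodies of get_system_message() and get_os() below are stand-ins written for this example (the real ones are not shown in this post), and build_query is a hypothetical helper.

```rust
use std::env::consts::OS;

// Hypothetical stand-in for the real get_system_message(), shortened here.
fn get_system_message() -> String {
    "Act as a smart terminal assistant API server. Here is your first task: ".to_owned()
}

// Assumption: get_os() reports the platform the binary was compiled for.
fn get_os() -> &'static str {
    OS
}

// Same layout as the format! call above: system message, task, then OS.
fn build_query(system: &str, query: &str, os: &str) -> String {
    format!("{system}{query} using terminal OS: {os}")
}

fn main() {
    let query = build_query(&get_system_message(), "extract audio from a video", get_os());
    println!("{query}");
}
```

Note that the system message and the user's query are joined without any separator, so the system message itself needs to end with a space or a newline.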

Finally, we are going to update the ApiResponse handling in the Commands::Search arm. Instead of just printing the content of the first choice, we will now deserialize the message content into the Instructions struct and print the first item of instruction_commands:

let command: Instructions =
    serde_json::from_str(&response.choices[0].message.content)
        .expect("Failed to parse response");

println!("{:#?}", command.instruction_commands[0]);

This will give us a better view of the command that the user should execute.
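Since the reply is not deterministic, the model may sometimes wrap the JSON in extra text, which would make serde_json::from_str fail. One way to soften that, sketched below, is to trim the reply down to the outermost braces before parsing. This is an assumption about how replies might go wrong, not something the repository does, and extract_json_object is a hypothetical helper.

```rust
// Returns the slice between the first '{' and the last '}', if both exist
// in that order. This does not validate the JSON; it only trims extra text.
fn extract_json_object(text: &str) -> Option<&str> {
    let start = text.find('{')?;
    let end = text.rfind('}')?;
    if end < start {
        return None;
    }
    Some(&text[start..=end])
}

fn main() {
    // Hypothetical reply with chatter before and after the JSON object.
    let reply = "Sure! Here you go:\n{\"instruction_commands\": [\"ls -la\"]}\nHope this helps.";
    match extract_json_object(reply) {
        Some(json) => println!("{json}"),
        None => println!("no JSON object found"),
    }
}
```

The trimmed slice can then be handed to the parser in place of the raw message content.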

Additional info:

The reason behind these changes is to make the code more adaptable and improve the user experience. By using a more detailed system message and a more structured query format, we can provide better instructions to the user. Additionally, adding the Serialize trait to the Instructions struct will allow us to display the data in a more user-friendly format later on.


Step 5: Handling External Commands

In this step, we are going to handle the external commands that are required to run the instruction commands. We'll make sure that the user has the necessary tools installed and provide them with the installation commands if they don't.

First, we'll add a new function handle_external_commands:

fn handle_external_commands(command: &Instructions) {
    // (implementation details will be explained below)
}

This function will take a reference to the Instructions struct and check if the required external commands are installed on the user's system. If not, it will print the list of installation commands.

We'll now update the Commands::Search arm to use the handle_external_commands function. We are also adding error handling for the deserialization of the Instructions struct:

let command: Result<Instructions, serde_json::Error> =
    serde_json::from_str(&response.choices[0].message.content);

// match the command and if error is found send request again
match command {
    Ok(command) => {
        handle_external_commands(&command);
        println!("{}", command.instruction_commands[0]);
    }
    Err(_) => {
        let response: ApiResponse = get_response(query, tokens).await?;
        let command: Instructions =
            serde_json::from_str(&response.choices[0].message.content).expect(
                "Error in parsing the response, Please try again with a different query",
            );
        handle_external_commands(&command);
        println!("{}", command.instruction_commands[0]);
    }
}
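The match above repeats the request-and-parse logic in the error branch. If the duplication starts to bother you, the retry can be pulled into a small helper. The sketch below is synchronous for simplicity (the real request is async), and retry is a hypothetical name, so treat it as an illustration of the idea and not as the project's code.

```rust
// Calls `attempt` up to `max_attempts` times and returns the first success,
// or the last error if every attempt fails.
// Panics if `max_attempts` is zero.
fn retry<T, E>(max_attempts: usize, mut attempt: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut last = attempt();
    for _ in 1..max_attempts {
        if last.is_ok() {
            break;
        }
        last = attempt();
    }
    last
}

fn main() {
    // Hypothetical stand-in for "request a reply and parse it": the first
    // reply is not a number, the second one is.
    let mut replies = vec!["not a number", "42"].into_iter();
    let parsed: Result<u32, std::num::ParseIntError> =
        retry(2, || replies.next().unwrap_or("").parse::<u32>());
    println!("{parsed:?}"); // prints Ok(42)
}
```

With a helper like this, the number of attempts becomes a single parameter instead of a copy of the parsing code.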

Now let's dive into the implementation of the handle_external_commands function. We will iterate through the external_commands and check if they are installed using the which command. If not, we will print the corresponding installation commands:

fn handle_external_commands(command: &Instructions) {
    let mut found_one = false;
    command
        .external_commands
        .iter()
        .enumerate()
        .for_each(|(index, tool)| {
            let output = Command::new("which").arg(tool.trim()).output();
            match output {
                Ok(output) => {
                    if !output.status.success() {
                        if !found_one {
                            println!("Run the following commands to install the required tools:");
                            found_one = true;
                        }
                        // Guard against a reply with fewer install commands than tools.
                        if let Some(install) = command.external_install.get(index) {
                            println!("{}", install);
                        }
                    }
                }
                Err(_) => {}
            }
        });
}

The handle_external_commands function starts by initializing a found_one variable to false. This variable will be used to print the "Run the following commands to install the required tools:" message only once.

Then, it iterates through the external_commands using enumerate and for_each. For each command, it checks if the command is installed using the which command. If the command is not installed, it prints the corresponding installation command from the external_install vector.
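The pairing logic can also be separated from the which call, which makes it easy to try out without touching the system. In this sketch the check is passed in as a closure. missing_installs is a hypothetical helper, and unlike the function above it pairs the two lists with zip, so a tool without a matching install command is simply skipped.

```rust
// Returns the install commands for every tool that `is_installed` reports
// as missing. Tools without a matching install command are skipped.
fn missing_installs<'a>(
    external_commands: &[String],
    external_install: &'a [String],
    is_installed: impl Fn(&str) -> bool,
) -> Vec<&'a str> {
    external_commands
        .iter()
        .zip(external_install)
        .filter(|(tool, _)| !is_installed(tool.trim()))
        .map(|(_, install)| install.as_str())
        .collect()
}

fn main() {
    let tools = vec!["ffmpeg".to_owned(), "ls".to_owned()];
    let installs = vec!["sudo apt install ffmpeg".to_owned(), "".to_owned()];
    // Hypothetical checker: pretend only `ls` is installed.
    let missing = missing_installs(&tools, &installs, |tool| tool == "ls");
    println!("{missing:?}"); // prints ["sudo apt install ffmpeg"]
}
```

In the real tool, the closure would run which (or any other lookup) for each name.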


Additional info:

These changes will greatly improve the user experience by ensuring that the user has all the necessary tools installed on their system. By providing the installation commands, the user can quickly and easily install the required tools and run the instruction commands without any issues.

Fun fact: Did you know that Rust's mascot is named Ferris? Ferris is a super cute crab who loves to help and learn, just like us! 🦀

Cover: Bhupesh

🔗 Repository: termoil
