|
| 1 | + |
| 2 | +# Analyze Scientific Papers Using ChatGPT™ Function Calls |
| 3 | + |
| 4 | +To run the code shown on this page, open the MLX file in MATLAB: [mlx-scripts/AnalyzeScientificPapersUsingFunctionCalls.mlx](mlx-scripts/AnalyzeScientificPapersUsingFunctionCalls.mlx) |
| 5 | + |
| 6 | +This example shows how to extract recent scientific papers from ArXiv, summarize them using ChatGPT, and write the results to a CSV file using the `openAIFunction` function. |
| 7 | + |
| 8 | +The example contains three steps: |
| 9 | +- Define a custom function for ChatGPT to use to process its input and output. |
| 10 | +- Extract papers from ArXiv. |
| 11 | +- Use ChatGPT to assess whether a paper is relevant to your query, and to add an entry to the results table if so. |
| 12 | + |
| 13 | +To run this example, you need a valid API key from a paid OpenAI™ API account. |
| 14 | + |
| 15 | +```matlab |
| 16 | +loadenv(".env") |
| 17 | +addpath('../..') |
| 18 | +``` |
| 19 | +# Initialize OpenAI API Function and Chat |
| 20 | + |
| 21 | +Use `openAIFunction` to define functions that the model can request calls to. |
| 22 | + |
| 23 | + |
| 24 | +Set up the function to store paper details and initiate a chat with the OpenAI API with a defined role as a scientific paper expert. |
| 25 | + |
| 26 | + |
| 27 | +Define the function that you want the model to have access to. In this example, the function is `writePaperDetails`. |
| 28 | + |
| 29 | +```matlab |
| 30 | +f = openAIFunction("writePaperDetails", "Function to write paper details to a table."); |
| 31 | +f = addParameter(f, "name", type="string", description="Name of the paper."); |
| 32 | +f = addParameter(f, "url", type="string", description="URL containing the paper."); |
| 33 | +f = addParameter(f, "explanation", type="string", description="Explanation on why the paper is related to the given topic."); |
| 34 | +
|
| 35 | +paperVerifier = openAIChat("You are an expert in filtering scientific papers. " + ... |
| 36 | + "Given a certain topic, you are able to decide if the paper" + ... |
| 37 | + " fits the given topic or not."); |
| 38 | +
|
| 39 | +paperExtractor = openAIChat("You are an expert in extracting information from a paper.", Tools=f); |
| 40 | +
|
| 41 | +function writePaperDetails(name, url, desc) |
| 42 | +filename = "papers_to_read.csv"; |
| 43 | +T = table(name, url, desc, VariableNames=["Name", "URL", "Description"]); |
| 44 | +writetable(T, filename, WriteMode="append"); |
| 45 | +end |
| 46 | +``` |
| 47 | +# Extract Papers From ArXiv |
| 48 | + |
| 49 | +Specify the category of interest, the date range for the query, and the maximum number of results to retrieve from the ArXiv API. |
| 50 | + |
| 51 | +```matlab |
| 52 | +category = "cs.CL"; |
| 53 | +endDate = datetime("today", "Format","uuuuMMdd"); |
| 54 | +startDate = datetime("today", "Format","uuuuMMdd") - 5; |
| 55 | +maxResults = 40; |
| 56 | +urlQuery = "https://export.arxiv.org/api/query?search_query=" + ... |
| 57 | + "cat:" + category + ... |
| 58 | + "&submittedDate=["+string(startDate)+"+TO+"+string(endDate)+"]"+... |
| 59 | + "&max_results=" + maxResults + ... |
| 60 | + "&sortBy=submittedDate&sortOrder=descending"; |
| 61 | +
|
| 62 | +options = weboptions('Timeout',160); |
| 63 | +code = webread(urlQuery,options); |
| 64 | +``` |
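
The date handling in the query can be checked in isolation. The following is a minimal sketch that uses a fixed date (an assumption made for reproducibility; the example itself uses `"today"`) to show how the `Format` option and `string` produce the text that goes into the URL.

```matlab
% Fixed example date (an assumption; the example above uses "today")
endDate = datetime(2024, 1, 15, "Format", "uuuuMMdd");
startDate = endDate - 5;   % subtracting a number from a datetime subtracts days

% string() follows the datetime display format, so each date becomes yyyyMMdd text
dateRange = "[" + string(startDate) + "+TO+" + string(endDate) + "]";
disp(dateRange)
```

Setting the format on the `datetime` value itself keeps the string concatenation short, because no separate formatting call is needed when the URL is built.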
| 65 | + |
| 66 | +Extract individual paper entries from the API response and use ChatGPT to determine whether each paper is related to the specified topic. |
| 67 | + |
| 68 | + |
| 69 | +ChatGPT will parse the XML file, so we only need to extract the relevant entries. |
| 70 | + |
| 71 | +```matlab |
| 72 | +entries = extractBetween(code, '<entry>', '</entry>'); |
| 73 | +``` |
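
To see what `extractBetween` returns here, consider a minimal sketch with a hypothetical, heavily trimmed stand-in for the XML response. It assumes the response arrives as a character vector; the sample text and titles are invented for illustration.

```matlab
% Hypothetical, heavily trimmed stand-in for the XML text returned by webread
sample = ['<feed>' ...
    '<entry><title>Paper A</title></entry>' ...
    '<entry><title>Paper B</title></entry>' ...
    '</feed>'];

% For char input with several matches, extractBetween returns a cell array,
% one cell per <entry>...</entry> pair, with the boundary tags removed
entries = extractBetween(sample, '<entry>', '</entry>');
disp(numel(entries))
disp(entries{1})
```

Each cell holds the raw XML of one paper, which is why the loop later passes `string(entries{i})` to the model instead of parsing the fields in MATLAB.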
| 74 | +# Write Relevant Information to Table |
| 75 | + |
| 76 | +Create an empty file and specify the topic of interest. |
| 77 | + |
| 78 | +```matlab |
| 79 | +filename = "papers_to_read.csv"; |
| 80 | +T = table([], [], [], VariableNames=["Name", "URL", "Description"]); |
| 81 | +writetable(T, filename); |
| 82 | +
|
| 83 | +topic = "Large Language Models"; |
| 84 | +``` |
| 85 | + |
| 86 | +Loop over the entries and see if they are relevant to the topic of interest. |
| 87 | + |
| 88 | +```matlab |
| 89 | +for i = 1:length(entries) |
| 90 | + prompt = "Given the following paper:" + newline +... |
| 91 | + string(entries{i})+ newline +... |
| 92 | + "Is it related to the topic: "+ topic +"?" + ... |
| 93 | + " Answer 'yes' or 'no'."; |
| 94 | + [text, response] = generate(paperVerifier, prompt); |
| 95 | +
|
| 96 | +``` |
| 97 | + |
| 98 | +If the model classifies this entry as relevant, prompt `paperExtractor` with the same entry so that it can request a function call. |
| 99 | + |
| 100 | +```matlab |
| 101 | + if contains(text, "yes", IgnoreCase=true) |
| 102 | + prompt = "Given the following paper:" + newline + string(entries{i})+ newline +... |
| 103 | + "Given the topic: "+ topic + newline + "Write the details to a table."; |
| 104 | + [text, response] = generate(paperExtractor, prompt); |
| 105 | +``` |
| 106 | + |
| 107 | +If `tool_calls` is part of the response, the model is requesting a function call. The request should contain the arguments needed to call the function that is specified at the end of this example and defined with `openAIFunction`. |
| 108 | + |
| 109 | +```matlab |
| 110 | + if isfield(response, "tool_calls") |
| 111 | + funCall = response.tool_calls; |
| 112 | + functionCallAttempt(funCall); |
| 113 | + end |
| 114 | + end |
| 115 | +end |
| 116 | +``` |
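
The relevance check in the loop depends on the argument order of `contains`, which takes the text to search first and the pattern second. A minimal sketch with a hypothetical model reply shows why the order matters:

```matlab
% Hypothetical reply; a model often adds punctuation or extra words
reply = "Yes.";

% Text first, pattern second: does the reply contain "yes"?
isRelevant = contains(reply, "yes", IgnoreCase=true);

% Swapped arguments ask whether "yes" contains "Yes.", which it does not
swapped = contains("yes", reply, IgnoreCase=true);

disp([isRelevant, swapped])
```

A substring test like this also matches replies such as "yes, partially", so a stricter check may be preferable if the model's answers vary.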
| 117 | + |
| 118 | +Read the generated file. |
| 119 | + |
| 120 | +```matlab |
| 121 | +data = readtable("papers_to_read.csv", Delimiter=",") |
| 122 | +``` |
| 123 | +# Helper Function |
| 124 | + |
| 125 | +This function handles function call attempts from the model, checking the function name and arguments before calling the appropriate function to store the paper details. |
| 126 | + |
| 127 | +```matlab |
| 128 | +function functionCallAttempt(funCall) |
| 129 | +``` |
| 130 | + |
| 131 | +The model can sometimes hallucinate function names, so you need to ensure that it's suggesting the correct name. |
| 132 | + |
| 133 | +```matlab |
| 134 | +if funCall.function.name == "writePaperDetails" |
| 135 | + try |
| 136 | +``` |
| 137 | + |
| 138 | +The model can sometimes return improperly formed JSON, which needs to be handled. |
| 139 | + |
| 140 | +```matlab |
| 141 | + funArgs = jsondecode(funCall.function.arguments); |
| 142 | + catch ME |
| 143 | + error("Model returned improperly formed JSON."); |
| 144 | + end |
| 145 | +``` |
| 146 | + |
| 147 | +The model can hallucinate arguments. The code needs to ensure the arguments have been defined before calling the function. |
| 148 | + |
| 149 | +```matlab |
| 150 | + if isfield(funArgs, "name") && isfield(funArgs, "url") && isfield(funArgs,"explanation") |
| 151 | + writePaperDetails(string(funArgs.name), string(funArgs.url), string(funArgs.explanation)); |
| 152 | + end |
| 153 | +end |
| 154 | +end |
| 155 | +``` |
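
The checks in the helper can be tried without calling the API. The following is a minimal sketch that uses a hypothetical arguments string in the shape the helper expects; the paper name, URL, and explanation are invented for illustration.

```matlab
% Hypothetical arguments string, shaped like the JSON the helper decodes
argsText = '{"name":"Paper A","url":"https://example.org/a","explanation":"Studies LLMs."}';
funArgs = jsondecode(argsText);

% isfield accepts several names at once and returns one logical per name
hasAllArgs = all(isfield(funArgs, ["name", "url", "explanation"]));

% Truncated JSON makes jsondecode throw, which try/catch can intercept
try
    jsondecode('{"name":"Paper A"');
    isMalformed = false;
catch
    isMalformed = true;
end

disp([hasAllArgs, isMalformed])
```

Checking all required fields before the call means that a response with a missing or misspelled argument is skipped instead of causing an error inside `writePaperDetails`.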
| 156 | + |
| 157 | +*Copyright 2023\-2024 The MathWorks, Inc.* |
| 158 | + |