
Commit b1d01e8

Added GitHub Copilot Case Study

1 parent d27f54b

2 files changed: +56 −28 lines

Diff for: 4-prompt-engineering-fundamentals/4.0-introduction.md

+15 −16
@@ -11,28 +11,27 @@ CODE CHALLENGE:
 If provided, should have an education focus - help show how the concepts can be applied to make the lives of teachers and students easier.
 -->

-So far, we've learned the following core concepts:
+## Introduction
+
+By now, you are familiar with these two terms:

 - **Generative AI** - is a category of artificial intelligence capable of _generating new content based on pre-trained models_ - in response to a natural language input or "prompt".
 - **Large Language Models (LLM)** - are a type of generative AI trained on massive quantities of text data to execute natural language processing (NLP) tasks at scale.

-We've heard of popular LLMs like [GPT-4](https://openai.com/gpt-4) (OpenAI), [BERT](https://github.com/google-research/bert) (Google) and [Llama-2](https://ai.meta.com/llama/) (Meta). And we've seen LLMs power _enterprise-grade applications_ like [GitHub Copilot](https://github.com/features/copilot), which is based on the [OpenAI Codex model](https://openai.com/blog/openai-codex) and built [in partnership with GitHub](https://github.blog/2023-05-17-inside-github-working-with-the-llms-behind-github-copilot/) to generate code and support other code-related tasks, driven by user prompts.
-
-However, the rapid growth and adoption of generative AI has also surfaced two key challenges:
-
-- **LLMs are _stochastic_ in nature.** The same prompt may have different outcomes with different LLMs - and may even generate slightly different outcomes on repeating the prompt with the same LLM.
-- **LLMs can _hallucinate_ responses.** LLMs use "pre-trained models", limiting their core knowledge to the training data. Prompts that explore questions outside that scope (e.g., more recent events) can result in unexpected responses that are inaccurate, confusing, or even contradictory to well-known facts.
-
-This has led to a new field of technology focused on _guiding_ LLMs with more effective prompt design that can reduce or mitigate some of these effects. This new field is popularly known as _prompt engineering_.
+In this lesson, we're going to introduce a third term - **Prompt Engineering** - which reflects a new field of engineering focused on _more effective prompt design_, with tools and techniques that guide LLMs to deliver more relevant and consistent results for our generative AI applications.
+
+- We'll define prompt engineering and motivate the need to design better prompts.
+- We'll explore prompt usage in real-world examples to understand opportunities and limitations.
+- We'll explore design techniques that help us iterate and validate prompts till they meet expectations.

-## Introduction
-
-In this lesson, we'll define prompt engineering in more detail and learn _why_ prompt engineering matters. Then, we'll explore _prompt examples_ in real-world applications to get a sense of how current LLM features and challenges influence responses. Finally, we'll discuss _prompt engineering techniques_ that can be used to improve the relevance and consistency of prompt-driven responses. We'll focus on basic techniques and best practices in this lesson - and set the stage for more _advanced_ techniques to be discussed in the next lesson.

 ## Learning Goals

 By the end of this lesson you will be able to:

-- Describe what prompt engineering is - and why it matters.
-- Discuss real-world examples of prompt usage - and identify relevant problems.
-- Apply popular prompt engineering techniques - and observe the impact on responses.
-- Learn to build & validate OpenAI prompts in a GitHub Codespaces-enabled Jupyter Notebook.
+- Describe Prompt Engineering - what it is, and why it matters to generative AI apps.
+- Discuss Real-World Prompt Examples - illustrating their value and highlighting their limitations.
+- Apply Prompt-Engineering Techniques - iterating & validating responses till the desired criteria are met.
+- Explore Prompt Engineering with OpenAI - using GitHub Codespaces, Jupyter Notebooks and an OpenAI API key.
+
+This will set the stage for you to explore more _advanced engineering techniques_ in the next lesson. It should also help you **apply these learnings** to your real-world application by answering this question:

-More importantly, you should be able to _apply these learnings_ to your education AI startup and be able to answer the question: _How does prompt engineering help me deliver a better experience to students, educators, administrators and other users in my education startup?_
+> _How can better prompt engineering help me deliver an enhanced experience to students, educators, administrators and other user audiences in my education startup?_
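
The last learning goal above mentions building and validating OpenAI prompts from a GitHub Codespaces-enabled Jupyter Notebook. As a minimal sketch of what that setup might look like (the model name and prompt here are illustrative assumptions, not part of the lesson):

```python
# Minimal sketch: send one prompt to OpenAI from a notebook cell.
# Assumes `pip install openai` and an OPENAI_API_KEY secret in the Codespace.
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative; any chat-capable model works
    messages=[{"role": "user", "content": "Explain prompt engineering in one sentence."}],
)
print(response.choices[0].message.content)
```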

Diff for: 4-prompt-engineering-fundamentals/4.1-what-is-prompt-engineering.md

+41 −12
@@ -8,21 +8,50 @@ Prompt Engineering.
 Define it and explain why it is needed.
 -->

-In this lesson, we'll try to answer two main questions:
-
-- What is Prompt Engineering?
-- Why do we need Prompt Engineering?
+In this lesson unit, we'll focus on answering two questions:
+
+1. What is Prompt Engineering?
+2. Why is Prompt Engineering needed?
+
+Let's dive in!

 ## 4.1.1 What is Prompt Engineering?

-Prompt engineering is the process of _designing and optimizing prompts_ for Generative AI models, for more relevant and reliable application experiences. It is an emerging field of study driven by the rapid adoption of LLMs in enterprise-scale applications. Prompt engineering has multiple facets:
-
-- Prompt _design_, which we can think of as the process of "writing good prompts" that guide the model to produce the desired output for our applications.
-- Prompt _optimization_, which we can think of as the process of "tuning prompts" to improve the model's performance over time.
+We know that Generative AI applications can create new kinds of content (text, images, audio, code and more) in response to a text input (question) from the user. This text input is called a **prompt**, and prompt engineering is the **process of designing and optimizing prompts** for Generative AI applications.
+
+Prompts tell the Generative AI model what to do. Think of them almost like a set of _instructions or questions_ that you provide as a rubric to guide the model towards more relevant and consistent responses. And, just like with any rubric, the _quality_ of the returned response depends on the _clarity_ of the guidelines provided by the instructions.
+
+Think of prompt engineering as a 2-part process (see the sketch below):
+
+1. **Prompt design** - the process of "writing a good first prompt" that provides core instructions to guide the model towards producing a _relevant_ response.
+2. **Prompt optimization** - the process of "tuning the prompt" over repeated iterations - validating results each time till the _quality_ of responses meets desired expectations.
+
+The most important thing to remember is that prompt engineering is **more art than science**. Think of it as a trial-and-error process where you first learn and apply recommended techniques (some of which we'll cover in this lesson) to improve the quality of results, and then bring your own application domain knowledge and intuition to refine the prompt further till you get the results you need.
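
To make this two-part process concrete, here is a minimal sketch using the OpenAI Python client; the `ask` helper, model choice and example prompts are illustrative assumptions, not part of the lesson:

```python
# Minimal sketch of prompt design followed by prompt optimization.
# The `ask` helper, model name and prompts are illustrative only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask(prompt: str) -> str:
    """Send a single user prompt and return the model's text response."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative; any chat-capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# 1. Prompt design: a first attempt with core instructions.
print(ask("Summarize the water cycle for a 5th-grade class."))

# 2. Prompt optimization: after reviewing the first response, tune the
#    instructions (length, format, audience) and validate again.
print(ask(
    "Summarize the water cycle for a 5th-grade class "
    "as exactly 3 short bullet points, using simple words."
))
```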

 ## 4.1.2 Why do we need Prompt Engineering?

-So why do we need prompt engineering? It turns out that LLMs are great at generating content, but they don't actually _understand_ the context or meaning of the content they generate. Some related challenges:
-
-- Model responses are _stochastic_. This means that the same model can produce different results for the same prompt - which creates user confusion.
-- Models can _hallucinate_ responses. This means models can produce responses that are inaccurate, nonsensical, or contradictory to facts - which dilutes user trust.
-- Model capabilities _vary_. Each model has its own quirks, causing the same prompt to be interpreted differently by different models - giving inconsistent user experiences.
-
-Prompt engineering is a way to address these challenges, by guiding the model to produce more relevant and reliable results. It is about _optimizing a prompt for a given generative AI model and a desired application goal_ so that we deliver relevant and reliable experiences to our users.
+Just like with rubrics, prompts benefit from an _iterate and validate_ process: we design the prompt, see how well its instructions were understood by analyzing the responses, then refine the prompt and try again - iterating till results come _closer_ to our expectations.
+
+But why do we need an entire **prompt engineering discipline**, with tools, techniques and best practices, for use in generative AI applications? Shouldn't our intuition be enough? It's because LLMs are great at _generating content_ but have no grasp of the _meaning_ of the content they just created. So they can't tell whether the output was relevant and met your expectations for quality.
+
+Here are some of the challenges that prompt engineering tries to address (the first is demonstrated in the sketch below):
+
+1. **Model responses are stochastic.** The same model may produce different results for the same prompt input. This can lead to inconsistent user experiences in your generative AI apps, and affect any follow-up actions or workflows driven from them.
+2. **Models can hallucinate responses.** The model can return responses that are incorrect, imaginary, or contradictory to known facts. Because LLMs use _pre-trained models_ (based on massive but finite training data), they can lack knowledge about concepts outside that trained scope. And since they don't provide citations for their responses, we have no way of knowing whether outputs are valid.
+3. **Model capabilities vary.** Want to generate a text summary of an article? There are many LLM options to pick from (GPT-4 from OpenAI, LLaMA-2 from Meta, BERT from Google, etc.) - but each has its own quirks and features, leading to results that may vary in content relevance, quality, and format. To get consistent or higher-quality results, we need to _fine-tune_ our usage to suit the model.
+
+Prompt engineering then becomes a tool and process we can apply to **guide the model towards relevant, quality results** - combining best practices (e.g., for a given model) with context and intuition (e.g., for a given application domain) to deliver results that align better with both user and developer expectations for the generative AI application.
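
As a quick demonstration of stochastic responses, here is a minimal sketch (prompt, model and `temperature` value are illustrative assumptions; `client` is the OpenAI client from the earlier sketch) that repeats one prompt and prints the varying outputs:

```python
# Minimal sketch: observe stochastic variation by repeating one prompt.
for run in range(3):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Name one famous scientist."}],
        temperature=1.0,  # higher values increase variation; 0 is most deterministic
    )
    print(f"Run {run + 1}:", response.choices[0].message.content)
```

Note that even at `temperature=0` responses are not guaranteed to be identical across runs, which is why validating outputs belongs in the prompt-engineering loop.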
+
+## 4.1.3 Case Study: GitHub Copilot
+
+Want to get a sense of how prompt engineering was applied to build an enterprise-grade generative AI application using a familiar Large Language Model? Say hello to [GitHub Copilot](https://github.com/features/copilot) - your "AI pair programmer" that brings the power of chat-based suggestions and support right into your development environment in Visual Studio Code.
+
+- **May 2023** | Learn how GitHub engineers used [prompt engineering](https://github.blog/2023-05-17-how-github-copilot-is-getting-better-at-understanding-your-code/) to make the model provide _contextually relevant responses with low latency_.
+- **May 2023** | GitHub Copilot is based on the [OpenAI Codex model](https://openai.com/blog/openai-codex). Learn how engineers then [worked on the model itself](https://github.blog/2023-05-17-inside-github-working-with-the-llms-behind-github-copilot/) to improve the initial prompt-design and subsequent prompt-tuning steps - with a focus on _code generation_ as the application domain context.
+- **Jun 2023** | GitHub developer advocates released their [How to use GitHub Copilot](https://github.blog/2023-06-20-how-to-write-better-prompts-for-github-copilot/) guide, providing prompt examples and prompt engineering best practices for developers.
+- **Jul 2023** | GitHub engineers released [A Developer's Guide to Prompt Engineering and LLMs](https://github.blog/2023-07-17-prompt-engineering-guide-generative-ai-llms/), which extends beyond basic prompt design to building "prompt engineering pipelines" (_gather context - snippeting - dressing them up - prioritization - completion - stop criteria_) for enhanced developer workflows targeting generative AI applications.
+- **Sep 2023** | It took 3+ years for GitHub Copilot to go from idea to production. [How to build an enterprise LLM app: Lessons from GitHub Copilot](https://github.blog/2023-09-06-how-to-build-an-enterprise-llm-application-lessons-from-github-copilot/) describes a broad strategy (_find it, nail it, scale it_) for building enterprise-grade solutions like the education-startup AI application we are currently discussing.
+
+Want to keep up with the GitHub Copilot team's learnings? Check out their [Engineering blog](https://github.blog/category/engineering/) for new posts like [this one](https://github.blog/2023-09-27-how-i-used-github-copilot-chat-to-build-a-reactjs-gallery-prototype/), which shows more examples of usage that you can apply to your own projects.
