ai-data/managed-inference/concepts.mdx
+4 −4
@@ -13,7 +13,7 @@ categories:
 ---
 ## Allowed IPs
-Allowed IPs are single IPs or IP blocks which have the [required permissions to remotely access a deployment](/ai-data/managed-inference/how-to/manage-allowed-ips/). They allow you to define which host and networks can connect to your Managed Inference endpoints. You can add, edit, or delete allowed IPs. In the absence of allowed IPs, all IP addresses are allowed by default.
+Allowed IPs are single IPs or IP blocks that have the [required permissions to remotely access a deployment](/ai-data/managed-inference/how-to/manage-allowed-ips/). They allow you to define which hosts and networks can connect to your Managed Inference endpoints. You can add, edit, or delete allowed IPs. In the absence of allowed IPs, all IP addresses are allowed by default.
 
 Access control is handled directly at the network level by Load Balancers, making the filtering more efficient and universal and relieving the Managed Inference server from this task.
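The allow-list behavior described in this hunk (single IPs or blocks permitted; an empty list means every address is allowed) can be sketched with the standard-library `ipaddress` module. The `ALLOWED` list and `is_allowed` helper below are hypothetical illustrations of the concept, not Scaleway's Load Balancer implementation.

```python
import ipaddress

# Hypothetical allow-list mixing a single IP (/32) and a CIDR block,
# as the documentation describes. Addresses are from RFC 5737 test ranges.
ALLOWED = [ipaddress.ip_network(n) for n in ("203.0.113.7/32", "198.51.100.0/24")]

def is_allowed(client_ip: str) -> bool:
    """Return True if the client IP falls inside any allowed block.

    Per the docs, an empty allow-list means all addresses are permitted.
    """
    if not ALLOWED:
        return True
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED)
```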
@@ -27,7 +27,7 @@ A deployment makes a trained language model available for real-world applicatio
 ## Embedding models
-Embedding models are a representation-learning technique that converts textual data into numerical vectors. These vectors capture semantic information about the text, and are often used as input to downstream machine-learning models, or algorithms.
+Embedding models are a representation-learning technique that converts textual data into numerical vectors. These vectors capture semantic information about the text and are often used as input to downstream machine-learning models or algorithms.
 
 ## Endpoint
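As a hedged illustration of the embedding paragraph above: semantically similar texts map to nearby vectors, which downstream code compares with a similarity measure. The 3-dimensional vectors below are invented for the example (a real embedding model produces hundreds or thousands of dimensions); only the cosine-similarity comparison is shown.

```python
import math

# Made-up embeddings: two sentences about a cat, one about finance.
embeddings = {
    "a cat sat on the mat": [0.9, 0.1, 0.2],
    "a kitten rested on a rug": [0.85, 0.15, 0.25],
    "quarterly revenue grew 8%": [0.1, 0.9, 0.7],
}

def cosine(u, v):
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# The related sentences score higher than the unrelated pair.
sim_close = cosine(embeddings["a cat sat on the mat"],
                   embeddings["a kitten rested on a rug"])
sim_far = cosine(embeddings["a cat sat on the mat"],
                 embeddings["quarterly revenue grew 8%"])
```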
@@ -65,7 +65,7 @@ LLMs have applications in natural language processing, text generation, translat
 In the context of LLMs, a prompt refers to the input provided to the model to generate a desired response.
 It typically consists of a sentence, paragraph, or series of keywords or instructions that guide the model in producing text relevant to the given context or task.
-The quality and specificity of the prompt greatly influences the generated output, as the model uses it to understand the user's intent and create responses accordingly.
+The quality and specificity of the prompt greatly influence the generated output, as the model uses it to understand the user's intent and create responses accordingly.
 
 ## Quantization
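The point about prompt specificity can be made concrete with a small sketch. The request shape below mimics common chat-completion payloads and is an assumption for illustration, not the documented Managed Inference request format.

```python
# A vague prompt leaves the model to guess audience, length, and tone;
# a specific prompt pins all three down.
vague_prompt = "Write about our product."

specific_prompt = (
    "Write a three-sentence product description of a managed LLM inference "
    "service, aimed at backend developers, in a neutral tone."
)

def build_request(prompt: str) -> dict:
    # Hypothetical payload shape, modeled on typical chat-completion APIs.
    return {"messages": [{"role": "user", "content": prompt}],
            "max_tokens": 150}
```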
@@ -74,4 +74,4 @@ LLMs provided for deployment are named with suffixes that denote their quantizat
 ## Retrieval Augmented Generation (RAG)
-RAG is an architecture combining information retrieval elements with language generation to enhance the capabilities of LLMs. It involves retrieving relevant context or knowledge from external sources, and incorporating it into the generation process to produce more informative and contextually grounded outputs.
+RAG is an architecture combining information retrieval elements with language generation to enhance the capabilities of LLMs. It involves retrieving relevant context or knowledge from external sources and incorporating it into the generation process to produce more informative and contextually grounded outputs.
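The retrieve-then-generate flow the RAG paragraph describes can be sketched in a few lines. Everything here is a stand-in: the keyword-overlap scoring substitutes for a real vector search, and the assembled prompt would be sent to an LLM endpoint in a real pipeline.

```python
# Toy document store; a real system would hold embedded document chunks.
DOCS = [
    "Allowed IPs restrict which hosts can reach a deployment.",
    "Quantization reduces model precision to shrink memory usage.",
    "Embeddings convert text into numerical vectors.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Naive retrieval: rank documents by shared lowercase words."""
    words = set(query.lower().split())
    scored = sorted(DOCS,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    """Incorporate retrieved context into the generation prompt."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```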