Jina Reader for Search Grounding to Improve Factuality of LLMs
Grounding is essential for GenAI apps. Our new https://s.jina.ai/ allows LLMs to access the latest knowledge from the web, enabling search grounding and making responses more trustworthy.
Grounding is absolutely essential for GenAI applications.
You have probably seen many tools, prompts, and RAG pipelines designed to improve the factuality of LLMs since 2023. Why? Because the primary barrier preventing enterprises from deploying LLMs to millions of users is trust: is the answer genuine, or is it merely a hallucination from the model? This is an industry-wide problem, and Jina AI has been working hard to solve it. Today, with the new Jina Reader search grounding feature, you can simply use https://s.jina.ai/YOUR_SEARCH_QUERY to search the web for the latest world knowledge. With this, you are one step closer to improving the factuality of LLMs, making their responses more trustworthy and helpful.
The Factuality Problem of LLMs
We all know LLMs can make things up and harm user trust. LLMs may say things that are not factual (aka hallucinate), especially regarding topics they didn't learn about during training. This could be either new information created since training or niche knowledge that has been "marginalized" during training.
As a result, when it comes to questions like "What's the weather today?" or "Who won the Oscar for Best Actress this year?" the model will either respond with "I don't know" or give you outdated information.
How Jina Reader Helps Better Grounding
Previously, users could prepend https://r.jina.ai to a particular URL to read its text and image content into an LLM-friendly format and use it for check grounding and fact verification. Since its first release on April 15th, we have served over 18 million requests from around the world, which suggests how popular it has become.
Today we are excited to move the needle further by introducing the search grounding API https://s.jina.ai. Simply prepend it to your query, and Reader will search the web and retrieve the top 5 results. Each result includes a title, LLM-friendly markdown (the full content, not an abstract), and a URL that allows you to attribute the source. Here is an example below; you are also encouraged to try our live demo.
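The prepend pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the helper names `build_search_url` and `search` are hypothetical, and percent-encoding the query and the 60-second timeout are assumptions rather than documented requirements.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

SEARCH_ENDPOINT = "https://s.jina.ai/"


def build_search_url(query: str) -> str:
    """Prepend the search endpoint to a percent-encoded query (encoding is an assumption)."""
    return SEARCH_ENDPOINT + quote(query, safe="")


def search(query: str, timeout: float = 60.0) -> str:
    """Fetch the search results as text. This performs a real network call."""
    request = Request(build_search_url(query), headers={"User-Agent": "grounding-example"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `search("Who won the Oscar for Best Actress this year?")` should return the retrieved results as text that can be placed into an LLM prompt together with the source URLs.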
We followed three principles when designing search grounding in Reader:
- Improve factuality;
- Access up-to-date information, i.e., world knowledge;
- Connect an answer to its source.
Besides being extremely easy to use, s.jina.ai is also highly scalable and customizable, as it leverages the existing flexible and scalable infrastructure of r.jina.ai. You can set parameters to control the image captioning, filter granularity, etc., via the request headers.
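As a rough sketch of what header-based configuration might look like, the helper below assembles a header dictionary. The header names `X-With-Generated-Alt` (image captioning) and `Accept: application/json` (JSON output) are assumptions that should be checked against the Reader documentation; this post does not list them, and `reader_headers` is a hypothetical helper.

```python
def reader_headers(caption_images: bool = False, as_json: bool = False) -> dict[str, str]:
    """Build optional request headers for Reader (header names are assumptions)."""
    headers: dict[str, str] = {}
    if caption_images:
        # Assumed header for enabling image captioning.
        headers["X-With-Generated-Alt"] = "true"
    if as_json:
        # Assumed way to request structured JSON instead of plain text.
        headers["Accept"] = "application/json"
    return headers
```

The resulting dictionary can be passed as the `headers` argument of whichever HTTP client you use to call s.jina.ai or r.jina.ai.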
Jina Reader as a Comprehensive Grounding Solution
If we combine search grounding (s.jina.ai) and check grounding (r.jina.ai), we can build a very comprehensive grounding solution for LLMs, agents, and RAG systems. In a typical trustworthy RAG workflow, Jina Reader works as follows:
1. User inputs a question;
2. Retrieve the latest information from the web using s.jina.ai;
3. Generate an initial answer with a citation to the search result from the last step;
4. Use r.jina.ai to ground the answer with your own URL, or read the inline URLs from the source returned from step 3 to get deeper grounding;
5. Generate the final answer and highlight potentially ungrounded claims to the user.
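The workflow above can be sketched as a small pipeline. Everything here is illustrative: `answer_with_grounding`, `GroundedAnswer`, and the injected callables (`search` wrapping s.jina.ai, `read` wrapping r.jina.ai, `generate` wrapping your LLM, and `is_supported` as a claim checker) are hypothetical names, and the result dictionaries are assumed to carry `url` and `content` keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GroundedAnswer:
    text: str
    sources: list[str]
    ungrounded_claims: list[str]


def answer_with_grounding(
    question: str,
    search: Callable[[str], list[dict]],
    generate: Callable[[str, list[dict]], tuple[str, list[str]]],
    read: Callable[[str], str],
    is_supported: Callable[[str, str], bool],
    check_urls: tuple[str, ...] = (),
) -> GroundedAnswer:
    # Retrieve the latest information from the web (search grounding).
    results = search(question)
    # Generate an initial answer plus the individual claims it makes.
    draft, claims = generate(question, results)
    # Read additional URLs for check grounding and pool all evidence.
    evidence_parts = [r["content"] for r in results]
    evidence_parts += [read(url) for url in check_urls]
    evidence = "\n".join(evidence_parts)
    # Collect claims the evidence does not support so they can be highlighted.
    ungrounded = [c for c in claims if not is_supported(c, evidence)]
    return GroundedAnswer(draft, [r["url"] for r in results], ungrounded)
```

Passing the network and model calls in as functions keeps the sketch independent of any particular HTTP client or LLM provider, and makes each step easy to replace or test.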
Higher Rate Limit with API Keys
Users can use the new search grounding endpoint for free without authorization. Moreover, when you provide a Jina AI API key in the request header (the same key can be used in the Embedding/Reranking API), you immediately get 200 requests per minute per IP for r.jina.ai and 40 requests per minute per IP for s.jina.ai. The details can be found in the table below:
Endpoint | Description | Rate limit w/o API key | Rate limit with API key | Token counting scheme | Average latency |
---|---|---|---|---|---|
r.jina.ai | Read a URL and return its content, useful for check grounding | 20 RPM | 200 RPM | Based on the output tokens | 3 seconds |
s.jina.ai | Search the web and return the top-5 results, useful for search grounding | 5 RPM | 40 RPM | Based on the output tokens for all 5 search results | 30 seconds |
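To stay within these limits on the client side, one option is to space requests evenly. The sketch below converts the RPM values from the table into a minimum interval between requests. The helper names are hypothetical, and the `Authorization: Bearer` header format is an assumption; this post only says the key goes in the request header.

```python
from typing import Optional

# Requests per minute, copied from the table above.
RATE_LIMITS_RPM = {
    "r.jina.ai": {"anonymous": 20, "with_key": 200},
    "s.jina.ai": {"anonymous": 5, "with_key": 40},
}


def min_interval_seconds(endpoint: str, api_key: Optional[str] = None) -> float:
    """Seconds to wait between requests so the per-minute limit is not exceeded."""
    tier = "with_key" if api_key else "anonymous"
    return 60.0 / RATE_LIMITS_RPM[endpoint][tier]


def auth_headers(api_key: Optional[str] = None) -> dict[str, str]:
    """Build the API key header (Bearer scheme is an assumption)."""
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}
```

For example, without a key s.jina.ai allows 5 RPM, so requests should be at least 60 / 5 = 12 seconds apart; with a key the limit of 40 RPM shortens that to 1.5 seconds.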
Conclusion
We believe grounding is essential for GenAI applications, and building grounded solutions should be easy for everyone. That's why we introduced the new search grounding endpoint, s.jina.ai, which allows developers to easily incorporate world knowledge into their GenAI applications. We want developers to establish user trust, provide explainable answers, and inspire curiosity in millions of users.