Contact sales

Grow your business with Jina AI.

Shortcut

FAQ

How to get my API key?

What's the rate limit?

Rate Limit

Rate limits are tracked in two ways: RPM (requests per minute) and TPM (tokens per minute). Limits are enforced per IP and can be reached based on whichever threshold—RPM or TPM—is hit first.

Columns

Product	API Endpoint	Description	w/o API Key	w/ API Key	w/ Premium API Key	Average Latency	Token Usage Counting	Allowed Request
Embedding API	`https://api.jina.ai/v1/embeddings`	Convert text/images to fixed-length vectors		500 RPM & 1,000,000 TPM	2,000 RPM & 5,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Reranker API	`https://api.jina.ai/v1/rerank`	Tokenize and segment long text		500 RPM & 1,000,000 TPM	2,000 RPM & 5,000,000 TPM	depends on the input size	Count the number of tokens in the input request.	POST
Reader API	`https://r.jina.ai`	Convert URL to LLM-friendly text	20 RPM	200 RPM	1000 RPM	4.6s	Count the number of tokens in the output response.	GET/POST
Reader API	`https://s.jina.ai`	Search the web and convert results to LLM-friendly text		40 RPM	100 RPM	8.7s	Count the number of tokens in the output response.	GET/POST
Reader API	`https://g.jina.ai`	Grounding a statement with web knowledge		10 RPM	30 RPM	22.7s	Count the total number of tokens in the whole process.	GET/POST
Classifier API (Zero-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using zero-shot classification		200 RPM & 500,000 TPM	1,000 RPM & 3,000,000 TPM	depends on the input size	Tokens counted as: input_tokens + label_tokens	POST
Classifier API (Few-shot)	`https://api.jina.ai/v1/classify`	Classify inputs using a trained few-shot classifier		20 RPM & 200,000 TPM	60 RPM & 1,000,000 TPM	depends on the input size	Tokens counted as: input_tokens	POST
Classifier API	`https://api.jina.ai/v1/train`	Train a classifier using labeled examples		20 RPM & 200,000 TPM	60 RPM & 1,000,000 TPM	depends on the input size	Tokens counted as: input_tokens × num_iters	POST
Segmenter API	`https://segment.jina.ai`	Tokenize and segment long text	20 RPM	200 RPM	1,000 RPM	0.3s	Token is not counted as usage.	GET/POST

Do I need a commercial license?

CC BY-NC License Self-Check

Are you using our official API or official images on Azure or AWS?

Yes

Are you using a paid API key or free trial key?

Are you using our official model images on AWS and Azure?

Other questions

What are the costs associated with using the Reader API?

How does the Reader API function?

Is the Reader API open source?

What is the typical latency for the Reader API?

Why should I use the Reader API instead of scraping the page myself?

Does the Reader API support multiple languages?

What should I do if a website blocks the Reader API?

Can the Reader API extract content from PDF files?

Can the Reader API process media content from web pages?

Is it possible to use the Reader API on local HTML files?

Does Reader API cache the content?

Can I use the Reader API to access content behind a login?

Can I use the Reader API to access PDF on arXiv?

How does image caption work in Reader?

What is the scalability of the Reader? Can I use it in production?

What is the rate limit of the Reader API?

How much does the Reranker API cost?

What is the difference between the two rerankers?

Is Jina Reranker open source?

Does the reranker support multiple languages?

What is the maximum length for queries and documents?

What is the maximum number of documents I can rerank per query?

What is the batch size and how many query-document tuples can I send in one request?

What latency can I expect when reranking 100 documents?

	Number of tokens in each document
Number of tokens in the query	256	512	1024	2048	4096
64	156	323	1366	2107	3571
128	194	369	1377	2123	3598
256	273	475	1397	2155	4299
512	468	1385	2114	3536	7068

Can I deploy Jina Reranker on AWS?

Do you offer a fine-tuned reranker on domain-specific data?

How were the jina-embeddings-v2 models trained?

What is jina-clip-v1, can I use it for search text and image?

Which languages do your models support?

What is the maximum length for a single sentence input?

What is the maximum number of sentences I can include in a single request?

How do I send images to the jina-clip-v1 model?

How do Jina Embeddings models compare to OpenAI's text-embedding-ada-002 model?

How seamless is the transition from OpenAI's text-embedding-ada-002 to your solution?

How tokens are calculated when using jina-clip-v1?

Do you provide models for embedding images or audio?

Can Jina Embedding models be fine-tuned with private or company data?

Can your endpoints be hosted privately on AWS, Azure, or GCP?

What's different about labels in zero-shot vs few-shot?

What's num_iters for and how should I use it?

How does public classifier sharing work?

How much data do I need for few-shot to work well?

Can it handle multiple languages and both text/images?

What are the hard limits I should know about?

How do I handle data changes over time?

What happens to my training data after I send it?

Zero-shot vs few-shot - when to use which?

Can I use different models for different languages/tasks?

How much does the Segmenter API cost?

If I don't provide an API key, what is the rate limit?

If I provide an API key, what is the rate limit?

Will you charge the tokens from my API key?

Does the Segmenter API support multiple languages?

What is the difference between GET and POST requests?

What is the maximum length I can tokenize per request?

How does the chunking feature work? Is it semantic chunking?

How do you handle special tokens such as 'endoftext' in the Segmenter API?

Does chunking support other languages than English?

How much does the Fine-tuning API cost?

What do I need to input? Do I need to provide training data?

How long does it take to fine-tune a model?

Where are the fine-tuned models stored?

If I provide a reference URL, how does the system use it?

Can I fine-tune a model for a specific language?

Can I fine-tune non-Jina embeddings, e.g., bge-M3?

How do you ensure the quality of the fine-tuned models?

How do you generate synthetic data?

Can I keep my fine-tuned models and synthetic data private?

How can I use the fine-tuned model?

I never received the email with the evaluation results. What should I do?

Can I use the same API key for embedding, reranking, reader, fine-tuning APIs?

Can I monitor the token usage of my API key?

What should I do if I forget my API key?

Do API keys expire?

Why is the first request for some models slow?

Is user input data used for training your models?

Is billing based on the number of sentences or requests?

Is there a free trial available for new users?

Are tokens charged for failed requests?

What payment methods are accepted?

Is invoicing available for token purchases?