Trying to get RAG set up with LibreChat. I'm using Docker. Below are the relevant settings in .env.
RAG_API_URL=http://host.docker.internal:8000
RAG_AZURE_OPENAI_API_KEY=xxx
RAG_AZURE_OPENAI_ENDPOINT=https://oai-ocioempsent-dev.openai.azure.com
EMBEDDINGS_PROVIDER=azure
EMBEDDINGS_MODEL=text-embedding-3-small
--
The first time I upload a file, I get this error:
2025-02-05 15:27:44 rag_api | [nltk_data] Error loading averaged_perceptron_tagger_eng: <urlopen
2025-02-05 15:27:44 rag_api | [nltk_data] error [SSL: CERTIFICATE_VERIFY_FAILED] certificate
2025-02-05 15:27:44 rag_api | [nltk_data] verify failed: unable to get local issuer certificate
2025-02-05 15:27:44 rag_api | [nltk_data] (_ssl.c:1007)>
2025-02-05 15:27:44 rag_api | [nltk_data] Error loading punkt_tab: <urlopen error [SSL:
2025-02-05 15:27:44 rag_api | [nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:
2025-02-05 15:27:44 rag_api | [nltk_data] unable to get local issuer certificate (_ssl.c:1007)>
2025-02-05 15:27:44 rag_api | 2025-02-05 23:27:44,556 - root - ERROR - Error during file processing:
2025-02-05 15:27:44 rag_api | **********************************************************************
2025-02-05 15:27:44 rag_api | Resource averaged_perceptron_tagger_eng not found.
2025-02-05 15:27:44 rag_api | Please use the NLTK Downloader to obtain the resource:
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | >>> import nltk
2025-02-05 15:27:44 rag_api | >>> nltk.download('averaged_perceptron_tagger_eng')
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | For more information see: https://www.nltk.org/data.html
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | Attempted to load taggers/averaged_perceptron_tagger_eng/
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | Searched in:
2025-02-05 15:27:44 rag_api | - '/app/nltk_data'
2025-02-05 15:27:44 rag_api | - '/root/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/lib/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/lib/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/lib/nltk_data'
2025-02-05 15:27:44 rag_api | **********************************************************************
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | Traceback: Traceback (most recent call last):
2025-02-05 15:27:44 rag_api | File "/app/main.py", line 476, in embed_file
2025-02-05 15:27:44 rag_api | data = loader.load()
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_core/document_loaders/base.py", line 31, in load
2025-02-05 15:27:44 rag_api | return list(self.lazy_load())
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_community/document_loaders/unstructured.py", line 107, in lazy_load
2025-02-05 15:27:44 rag_api | elements = self._get_elements()
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_community/document_loaders/powerpoint.py", line 64, in _get_elements
2025-02-05 15:27:44 rag_api | return partition_pptx(filename=self.file_path, **self.unstructured_kwargs) # type: ignore[arg-type]
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/common/metadata.py", line 162, in wrapper
2025-02-05 15:27:44 rag_api | elements = func(*args, **kwargs)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/chunking/dispatch.py", line 74, in wrapper
2025-02-05 15:27:44 rag_api | elements = func(*args, **kwargs)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/pptx.py", line 126, in partition_pptx
2025-02-05 15:27:44 rag_api | return list(_PptxPartitioner.iter_presentation_elements(opts))
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/pptx.py", line 169, in _iter_presentation_elements
2025-02-05 15:27:44 rag_api | yield from self._iter_shape_elements(shape)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/pptx.py", line 233, in _iter_shape_elements
2025-02-05 15:27:44 rag_api | elif is_possible_narrative_text(text):
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/text_type.py", line 84, in is_possible_narrative_text
2025-02-05 15:27:44 rag_api | if "eng" in languages and (sentence_count(text, 3) < 2) and (not contains_verb(text)):
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/partition/text_type.py", line 186, in contains_verb
2025-02-05 15:27:44 rag_api | pos_tags = pos_tag(text)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/unstructured/nlp/tokenize.py", line 78, in pos_tag
2025-02-05 15:27:44 rag_api | parts_of_speech.extend(_pos_tag(tokens))
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/nltk/tag/__init__.py", line 168, in pos_tag
2025-02-05 15:27:44 rag_api | tagger = _get_tagger(lang)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/nltk/tag/__init__.py", line 110, in _get_tagger
2025-02-05 15:27:44 rag_api | tagger = PerceptronTagger()
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 183, in __init__
2025-02-05 15:27:44 rag_api | self.load_from_json(lang)
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 273, in load_from_json
2025-02-05 15:27:44 rag_api | loc = find(f"taggers/averaged_perceptron_tagger_{lang}/")
2025-02-05 15:27:44 rag_api | File "/usr/local/lib/python3.10/site-packages/nltk/data.py", line 579, in find
2025-02-05 15:27:44 rag_api | raise LookupError(resource_not_found)
2025-02-05 15:27:44 rag_api | LookupError:
2025-02-05 15:27:44 rag_api | **********************************************************************
2025-02-05 15:27:44 rag_api | Resource averaged_perceptron_tagger_eng not found.
2025-02-05 15:27:44 rag_api | Please use the NLTK Downloader to obtain the resource:
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | >>> import nltk
2025-02-05 15:27:44 rag_api | >>> nltk.download('averaged_perceptron_tagger_eng')
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | For more information see: https://www.nltk.org/data.html
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | Attempted to load taggers/averaged_perceptron_tagger_eng/
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | Searched in:
2025-02-05 15:27:44 rag_api | - '/app/nltk_data'
2025-02-05 15:27:44 rag_api | - '/root/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/lib/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/share/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/lib/nltk_data'
2025-02-05 15:27:44 rag_api | - '/usr/local/lib/nltk_data'
2025-02-05 15:27:44 rag_api | **********************************************************************
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api |
2025-02-05 15:27:44 rag_api | 2025-02-05 23:27:44,558 - root - INFO - Request POST http://rag_api:8000/embed - 400
2025-02-05 15:27:44 LibreChat | 2025-02-05 23:27:44 error: Error uploading vectors The request was made and the server responded with a status code that falls out of the range of 2xx: Request failed with status code 400. Error response data:
2025-02-05 15:27:44 LibreChat |
2025-02-05 15:27:44 LibreChat | 2025-02-05 23:27:44 error: [/files] Error processing file: Request failed with status code 400
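The CERTIFICATE_VERIFY_FAILED errors above typically mean the container's Python cannot find the CA that signed the connection, which is common behind a TLS-intercepting corporate proxy. A stdlib-only sketch to check which trust store the container consults (run it inside the rag_api container; the corporate-CA path in the comment is hypothetical):

```python
# Sketch: inspect the trust store this container's Python uses for TLS
# verification, i.e. the same store the failing handshakes above consult.
import ssl

paths = ssl.get_default_verify_paths()
print("cafile:", paths.cafile)  # PEM bundle, if one is configured
print("capath:", paths.capath)  # hashed-certificate directory, if any

# If a proxy re-signs traffic, its root CA must be present in one of these
# locations, or be supplied explicitly, e.g. via the SSL_CERT_FILE and
# REQUESTS_CA_BUNDLE environment variables (the path below is hypothetical):
#   SSL_CERT_FILE=/usr/local/share/ca-certificates/corp-root.pem
```

If `cafile` points at a bundle that lacks the proxy's root CA, the NLTK download here and the tiktoken fetch later in this log will fail in the same way.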
Subsequent file upload attempts get this:
2025-02-05 15:37:23 rag_api | 2025-02-05 23:37:23,143 - root - ERROR - Failed to store data in vector DB | File ID: 2f2f487e-2353-4e7e-8a2f-794822351cef | User ID: 67a3a7196f3d4d63c458eb7a | Error: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)'))) | Traceback: Traceback (most recent call last):
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
2025-02-05 15:37:23 rag_api | self._validate_conn(conn)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
2025-02-05 15:37:23 rag_api | conn.connect()
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 730, in connect
2025-02-05 15:37:23 rag_api | sock_and_verified = _ssl_wrap_socket_and_match_hostname(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 909, in _ssl_wrap_socket_and_match_hostname
2025-02-05 15:37:23 rag_api | ssl_sock = ssl_wrap_socket(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
2025-02-05 15:37:23 rag_api | ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
2025-02-05 15:37:23 rag_api | return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/ssl.py", line 513, in wrap_socket
2025-02-05 15:37:23 rag_api | return self.sslsocket_class._create(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/ssl.py", line 1104, in _create
2025-02-05 15:37:23 rag_api | self.do_handshake()
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/ssl.py", line 1375, in do_handshake
2025-02-05 15:37:23 rag_api | self._sslobj.do_handshake()
2025-02-05 15:37:23 rag_api | ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | During handling of the above exception, another exception occurred:
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | Traceback (most recent call last):
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
2025-02-05 15:37:23 rag_api | response = self._make_request(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
2025-02-05 15:37:23 rag_api | raise new_e
2025-02-05 15:37:23 rag_api | urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | The above exception was the direct cause of the following exception:
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | Traceback (most recent call last):
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
2025-02-05 15:37:23 rag_api | resp = conn.urlopen(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
2025-02-05 15:37:23 rag_api | retries = retries.increment(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 519, in increment
2025-02-05 15:37:23 rag_api | raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
2025-02-05 15:37:23 rag_api | urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)')))
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | During handling of the above exception, another exception occurred:
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | Traceback (most recent call last):
2025-02-05 15:37:23 rag_api | File "/app/main.py", line 326, in store_data_in_vector_db
2025-02-05 15:37:23 rag_api | ids = await vector_store.aadd_documents(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_core/vectorstores/base.py", line 324, in aadd_documents
2025-02-05 15:37:23 rag_api | return await run_in_executor(None, self.add_documents, documents, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 588, in run_in_executor
2025-02-05 15:37:23 rag_api | return await asyncio.get_running_loop().run_in_executor(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2025-02-05 15:37:23 rag_api | result = self.fn(*self.args, **self.kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_core/runnables/config.py", line 579, in wrapper
2025-02-05 15:37:23 rag_api | return func(*args, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_core/vectorstores/base.py", line 287, in add_documents
2025-02-05 15:37:23 rag_api | return self.add_texts(texts, metadatas, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_community/vectorstores/pgvector.py", line 561, in add_texts
2025-02-05 15:37:23 rag_api | embeddings = self.embedding_function.embed_documents(list(texts))
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_openai/embeddings/base.py", line 588, in embed_documents
2025-02-05 15:37:23 rag_api | return self._get_len_safe_embeddings(texts, engine=engine)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_openai/embeddings/base.py", line 480, in _get_len_safe_embeddings
2025-02-05 15:37:23 rag_api | _iter, tokens, indices = self._tokenize(texts, _chunk_size)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/langchain_openai/embeddings/base.py", line 420, in _tokenize
2025-02-05 15:37:23 rag_api | encoding = tiktoken.encoding_for_model(model_name)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken/model.py", line 105, in encoding_for_model
2025-02-05 15:37:23 rag_api | return get_encoding(encoding_name_for_model(model_name))
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken/registry.py", line 86, in get_encoding
2025-02-05 15:37:23 rag_api | enc = Encoding(**constructor())
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken_ext/openai_public.py", line 76, in cl100k_base
2025-02-05 15:37:23 rag_api | mergeable_ranks = load_tiktoken_bpe(
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken/load.py", line 144, in load_tiktoken_bpe
2025-02-05 15:37:23 rag_api | contents = read_file_cached(tiktoken_bpe_file, expected_hash)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken/load.py", line 63, in read_file_cached
2025-02-05 15:37:23 rag_api | contents = read_file(blobpath)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/tiktoken/load.py", line 24, in read_file
2025-02-05 15:37:23 rag_api | resp = requests.get(blobpath)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 73, in get
2025-02-05 15:37:23 rag_api | return request("get", url, params=params, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 59, in request
2025-02-05 15:37:23 rag_api | return session.request(method=method, url=url, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
2025-02-05 15:37:23 rag_api | resp = self.send(prep, **send_kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
2025-02-05 15:37:23 rag_api | r = adapter.send(request, **kwargs)
2025-02-05 15:37:23 rag_api | File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 698, in send
2025-02-05 15:37:23 rag_api | raise SSLError(e, request=request)
2025-02-05 15:37:23 LibreChat | 2025-02-05 23:37:23 error: Error uploading vectors
2025-02-05 15:37:23 LibreChat | Something happened in setting up the request. Error message:
2025-02-05 15:37:23 LibreChat | File embedding failed.
2025-02-05 15:37:23 rag_api | requests.exceptions.SSLError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)')))
2025-02-05 15:37:23 rag_api |
2025-02-05 15:37:23 rag_api | 2025-02-05 23:37:23,143 - root - INFO - Request POST http://rag_api:8000/embed - 200
2025-02-05 15:37:23 LibreChat | 2025-02-05 23:37:23 error: [/files] Error processing file: File embedding failed.
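Both failures in the logs (the NLTK downloads in the first upload attempt and the tiktoken `cl100k_base` fetch here) share one root cause: outbound HTTPS from the `rag_api` container fails certificate verification, which typically means a TLS-intercepting corporate proxy whose CA certificate is not in the container's trust store. A minimal sketch of environment overrides that could be passed to the `rag_api` service follows — the CA bundle path and cache directories are assumptions for illustration, not LibreChat defaults:

```python
# Sketch: env vars that redirect Python's HTTPS stack at a mounted corporate
# CA bundle and at pre-seeded caches, so rag_api needn't download at runtime.
# The paths below are hypothetical; adjust to wherever you mount the files.
CA_BUNDLE = "/usr/local/share/ca-certificates/corp-ca.crt"  # assumed mount point

def ssl_env_overrides(ca_bundle: str = CA_BUNDLE) -> dict:
    """Return env vars honoured by the libraries seen in the traceback."""
    return {
        "REQUESTS_CA_BUNDLE": ca_bundle,  # read by requests (tiktoken's fetch)
        "SSL_CERT_FILE": ca_bundle,       # read by ssl/urllib (nltk.download)
        "NLTK_DATA": "/app/nltk_data",    # point NLTK at pre-seeded resources
        "TIKTOKEN_CACHE_DIR": "/app/tiktoken_cache",  # pre-seeded BPE files
    }

if __name__ == "__main__":
    for key, value in ssl_env_overrides().items():
        print(f"{key}={value}")
```

These could go in the `environment:` section of the `rag_api` service in `docker-compose.yml` (or the `.env` file, if the compose file forwards them). Pre-populating `/app/nltk_data` and the tiktoken cache from a machine with working certificates avoids the runtime downloads entirely.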