SyntaxError: Unexpected token '<', " <h"... is not valid JSON #1892

Open
Droppix opened this issue Mar 13, 2025 · 9 comments
Labels
bug: Something isn't working
needs more info: This issue needs a minimal complete and verifiable example

Comments


Droppix commented Mar 13, 2025

Description
I am trying to index a single document with 3 attributes (document size: 41 KB).
I get this error: SyntaxError: Unexpected token '<', " <h"... is not valid JSON

Expected behavior
1/ If I send the command using curl, I get no errors

Request =>
curl -X PUT http://192.168.1.62:7700/indexes/cdfd3a26-d62a-484c-b78f-fb54805335e0/documents -H 'Content-Type: application/json' -H 'Authorization: Bearer xxxxxxx' --data-binary @text.json

Response =>
{"taskUid":88,"indexUid":"cdfd3a26-d62a-484c-b78f-fb54805335e0","status":"enqueued","type":"documentAdditionOrUpdate","enqueuedAt":"2025-03-12T14:47:31.554279Z"}

2/ If I send the same JSON with your npm package and call addDocuments([json]) or addDocuments(json), I get an exception:

SyntaxError: Unexpected token '<', " <h"... is not valid JSON
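This error message means JSON.parse received an HTML document (typically a server or proxy error page) rather than JSON, since HTML starts with '<'. A minimal sketch reproducing the same failure mode (the HTML string below is made up for illustration):

```javascript
// JSON.parse on an HTML error page throws exactly this kind of
// SyntaxError, because '<' is not a valid start of a JSON value.
try {
  JSON.parse("<html><body><h1>502 Bad Gateway</h1></body></html>");
} catch (err) {
  console.log(err.name); // SyntaxError
  console.log(err.message.startsWith("Unexpected token")); // true
}
```

In other words, the client-side parse error is a symptom: something between the client and Meilisearch answered with an HTML page instead of the expected JSON body.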

// Note: the await calls below must run inside an async function
// (CommonJS modules have no top-level await).
const { MeiliSearch } = require('meilisearch')
const fs = require('fs-extra')

const client = new MeiliSearch({
    host: 'https://xxxxxx',
    apiKey: 'xxxxxx',
})

let content = fs.readFileSync('./text.json', 'utf-8')

// json content
// { id: 'cdfd3a26-d62a-484c-b78f-fb54805335e0', category: 'transcript', context: '....' -> only text }


try {
    const index = await client.index('xxxxx')

    let documents = []
    const json = fs.readJsonSync('./text.json')
    documents.push(json)

    let response = await index.addDocuments(documents)
    console.log(response)
} catch (err) {
    console.log(err)
}

text.json.zip

Environment (please complete the following information):

  • OS: [Mac OS]
  • Meilisearch version: [last version]
  • meilisearch-js version: "meilisearch": "^0.49.0"
  • Browser: [NodeJS v18]
@flevi29
Copy link
Collaborator

flevi29 commented Mar 13, 2025

Hey @Droppix, thanks for the report and sorry for the trouble. I'll look into it.

Author

Droppix commented Mar 13, 2025

Other remarks:
This indexing error only occurs with content larger than 40 KB behind my nginx. Meilisearch displays an error on the console:

2025-03-13T17:57:25.742610Z  INFO actix_server::builder: starting 12 workers
2025-03-13T17:57:25.742640Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime
2025-03-13T17:57:44.775400Z ERROR actix_http::h1::dispatcher: stream error: request parse error: invalid HTTP version specified
2025-03-13T17:57:59.065578Z  INFO HTTP request{method=POST host="192.168.1.64:7700" route=/indexes/cdfd3a26-d62a-484c-b78f-fb54805335e0/documents query_parameters= user_agent=node status_code=202}: meilisearch: close time.busy=2.70ms time.idle=56.6ms
2025-03-13T17:57:59.312943Z  INFO index_scheduler::scheduler::process_index_operation: document indexing done indexing_result=DocumentAdditionResult { indexed_documents: 1, number_of_documents: 4 } processed_in=245.772042ms
2025-03-13T17:57:59.314469Z  INFO index_scheduler::scheduler: A batch of tasks was successfully completed with 1 successful tasks and 0 failed tasks.
2025-03-13T17:58:00.127851Z  INFO HTTP request{method=DELETE host="192.168.1.64:7700" route=/indexes/cdfd3a26-d62a-484c-b78f-fb54805335e0/documents/bfdb2101-6ede-4f7f-bd82-b99c612525f4 query_parameters= user_agent=node status_code=202}: meilisearch: close time.busy=400µs time.idle=1.87ms
2025-03-13T17:58:00.391754Z  INFO index_scheduler::scheduler::process_index_operation: document indexing done indexing_result=DocumentAdditionResult { indexed_documents: 1, number_of_documents: 3 } processed_in=263.6115ms
2025-03-13T17:58:00.392587Z  INFO index_scheduler::scheduler: A batch of tasks was successfully completed with 1 successful tasks and 0 failed tasks.
2025-03-13T17:59:10.081291Z ERROR actix_http::h1::dispatcher: stream error: request parse error: invalid HTTP version specified

FYI: my nginx conf is:

 location /bigquery {
        #proxy_read_timeout 300s;
        #proxy_connect_timeout 75s;

        proxy_pass              http://192.168.1.64:7700/;
        access_log              /access.log custom;
        error_log               /error.log;

        #client_max_body_size   150M;
        #client_body_buffer_size 100K;

        #proxy_http_version 1.1;
        #proxy_buffers 8 64k;
        #proxy_buffer_size 128k;
        #proxy_cache_bypass $http_upgrade;

        #proxy_redirect         off;
        #proxy_set_header       Host    $http_host;
        #proxy_set_header       X-Real-IP $remote_addr;
        #proxy_set_header       X-Forwarded-For $proxy_add_x_forwarded_for;
        #proxy_set_header       X-Forwarded-Proto https;

        proxy_set_header        Host $host;
        proxy_set_header        X-Real-IP $remote_addr;
        proxy_set_header        Accept-Encoding "";
        proxy_set_header        X-Forwarded-Scheme $scheme;
        proxy_set_header        X-Forwarded-Proto $scheme;
        proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;

    }

    #listen [::]:443 ssl; # managed by Certbot
    #listen 443 ssl; # managed by Certbot

    listen 443 http2 ssl;
    listen [::]:443 http2 ipv6only=on ssl;
}

Collaborator

flevi29 commented Mar 14, 2025

The JSON file does not seem to be valid. I tried fixing it here: text.json.
With the fixed file I could successfully add the document using the latest unreleased main branch of meilisearch-js with the following code:

import { readFileSync } from "node:fs";
import { MeiliSearch } from "./dist/esm/index.js";

const client = new MeiliSearch({
  host: "http://127.0.0.1:7700",
  apiKey: "masterKey",
});
const INDEX_NAME = "index";
await client.deleteIndexIfExists(INDEX_NAME);
const index = client.index(INDEX_NAME);

const content = JSON.parse(readFileSync("./text.json", "utf-8"));

const enqueuedTask = await index.addDocuments([content]);
const task = await index.waitForTask(enqueuedTask.taskUid);
console.log(task);

const response = await index.getDocument(content.id);
console.log(response);

So either #1741 fixed it or we just need a better reproduction.

@Droppix Can you provide a reproducible example, please?

flevi29 added the needs more info label Mar 14, 2025
Author

Droppix commented Mar 15, 2025

The JSON file is valid; otherwise curl would return an error, and it does not.
Response from curl:
{"taskUid":88,"indexUid":"cdfd3a26-d62a-484c-b78f-fb54805335e0","status":"enqueued","type":"documentAdditionOrUpdate","enqueuedAt":"2025-03-12T14:47:31.554279Z"}

Over the last few days I've run many tests, and I've noticed that with both curl and Node.js the indexing works fine (no error), but only on the local network.

However, as soon as the request goes through HTTPS (via nginx), it fails with both curl and Node.js.
So I think the problem lies with the nginx configuration, which is why I've provided mine.

Meilisearch displays this log (when indexing via curl or Node.js over HTTPS):

2025-03-15T09:56:48.345710Z  INFO actix_server::server: Actix runtime found; starting in Actix runtime
2025-03-15T10:08:30.944109Z  INFO HTTP request{method=GET host="dev-services.xxxxxx.com" route=/stats query_parameters= user_agent=curl/8.7.1 status_code=200}: meilisearch: close time.busy=4.63ms time.idle=1.06ms
2025-03-15T10:08:49.984661Z ERROR actix_http::h1::dispatcher: stream error: request parse error: invalid HTTP version specified

Could you check my nginx configuration with your team? I'm convinced that something is missing, certainly something to do with the size of the payload sent, especially since anything under 40 KB goes through fine.

server {
  server_name dev-services.xxxx.com; # managed by Certbot

  location /bigquery {
          #proxy_read_timeout 300s;
          #proxy_connect_timeout 75s;
  
          proxy_pass              http://192.168.1.64:7700/;
          access_log              /access.log custom;
          error_log               /error.log;
  
          #client_max_body_size   150M;
          #client_body_buffer_size 100K;
  
          #proxy_http_version 1.1;
          #proxy_buffers 8 64k;
          #proxy_buffer_size 128k;
          #proxy_cache_bypass $http_upgrade;
  
          #proxy_redirect         off;
          #proxy_set_header       Host    $http_host;
          #proxy_set_header       X-Real-IP $remote_addr;
          #proxy_set_header       X-Forwarded-For $proxy_add_x_forwarded_for;
          #proxy_set_header       X-Forwarded-Proto https;
  
          proxy_set_header        Host $host;
          proxy_set_header        X-Real-IP $remote_addr;
          proxy_set_header        Accept-Encoding "";
          proxy_set_header        X-Forwarded-Scheme $scheme;
          proxy_set_header        X-Forwarded-Proto $scheme;
          proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
  
      }

      listen 443 http2 ssl;
      listen [::]:443 http2 ipv6only=on ssl;

      ssl_certificate /.../fullchain.pem;
      ssl_certificate_key /.../m/privkey.pem;

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
    ssl_prefer_server_ciphers on;

    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:5m;
    ssl_session_tickets off;

    # Tune settings for fast HTTP/2 uploads
    # https://blog.cloudflare.com/delivering-http-2-upload-speed-improvements/
    http2_body_preread_size 512k;
  }
}

Thanks

Collaborator

flevi29 commented Mar 15, 2025

Well, I just tried parsing it and the following is what I get:

import { readFileSync } from "node:fs";
import { MeiliSearch } from "meilisearch";

const client = new MeiliSearch({
  host: "http://127.0.0.1:7700",
  apiKey: "masterKey",
});
const INDEX_NAME = "index";
await client.deleteIndexIfExists(INDEX_NAME);
const index = client.index(INDEX_NAME);

const content = JSON.parse(readFileSync("./text.json", "utf-8"));

const enqueuedTask = await index.addDocuments([content]);
const task = await index.waitForTask(enqueuedTask.taskUid);
console.log(task);

const response = await index.getDocument(content.id);
console.log(response);
<anonymous_script>:2
    id: 'bfdb2101-6ede-4f7f-bd82-b99c612525f4',
    ^

SyntaxError: Expected property name or '}' in JSON at position 6 (line 2 column 5)
    at JSON.parse (<anonymous>)
    at file:///C:/Users/****/testsynerr/index.js:12:22
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)

Node.js v22.14.0

https://jsonlint.com/ is also returning an error for it.

[screenshot]

Maybe the file is different from the one you're testing, or curl is adjusting it somehow; I'm not sure.

As for the nginx config, I can't provide that kind of support, and I'm pretty sure no one else here can either; maybe @Strift can chime in, though.

Author

Droppix commented Mar 15, 2025

I think the file GitHub uploaded wasn't the right one, but I retested and it's the same.

[screenshot]

test.json.zip

Collaborator

flevi29 commented Mar 15, 2025

Yeah, this one works.

curquiza added the bug label Mar 19, 2025
Author

Droppix commented Mar 20, 2025

I'm coming back to you to see if you've been able to find a solution, as we're still stuck.

Thanks

Member

I think this issue is clearly related to your nginx config. @Droppix, can you try the config explained here: https://www.meilisearch.com/docs/guides/deployment/running_production#step-5-secure-and-finish-your-setup

Then you can start adding the other configuration directives back one by one until you see what is happening. I can't give you more help because time is a constraint for me, but that is my best recommendation.
Also, https://caddyserver.com/docs/automatic-https is a very good and easier alternative to nginx :)
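As a starting point, here is a minimal sketch of such a location block, adapted from the config shared above (a hypothesis, not a verified fix): nginx proxies to upstreams with HTTP/1.0 by default, so enabling the commented-out `proxy_http_version 1.1;` directive is worth trying first, given that the backend logs "invalid HTTP version specified".

```nginx
location /bigquery {
    proxy_pass          http://192.168.1.64:7700/;

    # nginx defaults to proxy_http_version 1.0; switch to 1.1 and clear
    # the Connection header so request bodies are forwarded to the
    # backend over a plain HTTP/1.1 connection.
    proxy_http_version  1.1;
    proxy_set_header    Connection "";

    proxy_set_header    Host $host;
    proxy_set_header    X-Real-IP $remote_addr;
    proxy_set_header    X-Forwarded-Proto $scheme;
    proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;
}
```

If that alone doesn't help, re-add the buffering directives (`proxy_buffers`, `client_body_buffer_size`, `http2_body_preread_size`) one at a time, since the failure only appears above a payload-size threshold.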
