
Params middleware does not decode some strings correctly #269

Open
@iku000888

Hi James, thanks for the continued work on maintaining ring.

I want to put this issue out there in case anyone else stumbles upon the same problem, and to gather some feedback.

In our project we handle forms that are not UTF-8 encoded (Shift_JIS, to be specific), and we found that the params middleware garbles the incoming string even though we use the wrap-params middleware with the Shift_JIS option, e.g. (param/wrap-params {:encoding "Shift_JIS"})

Looking deeper into the source code, I see that java.net.URLDecoder is used for parsing. java.net.URLDecoder does not decode some non-UTF-8 strings correctly, while org.apache.commons.codec.net.URLCodec does, as the snippet below illustrates.

  ;; "モジバケコワイ", URL encoded with Shift_JIS by the browser
  "%83%82%83W%83o%83P%83R%83%8F%83C"

  ;; java.net.URLDecoder garbles most of the characters
  (import [java.net URLDecoder])
  (URLDecoder/decode "%83%82%83W%83o%83P%83R%83%8F%83C" "Shift-JIS")
  ;; => "モ�W�o�P�Rワ�C"

  ;; org.apache.commons.codec.net.URLCodec decodes the same input correctly
  (import [org.apache.commons.codec.net URLCodec])
  (let [codec (URLCodec. "Shift-JIS")]
    (.decode codec "%83%82%83W%83o%83P%83R%83%8F%83C" "Shift-JIS"))
  ;; => "モジバケコワイ"
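
As far as I can tell, the difference is that the second byte of a Shift_JIS character can fall in the ASCII range (here `W`, `o`, `P`, `R`, `C`), so the browser leaves it unescaped. A decoder that converts each run of `%xx` escapes to text on its own then sees a lone lead byte such as `0x83` and replaces it with `�`. Below is a minimal sketch of the alternative, assuming the input is plain ASCII: collect every byte first (escaped or literal), then decode the whole byte array once. It uses only the JDK; `decode-bytes-first` is a hypothetical helper name for illustration, not something that exists in ring or commons-codec.

```clojure
(import '[java.io ByteArrayOutputStream])

(defn decode-bytes-first
  "Hypothetical helper: form-decodes `s` by first collecting all bytes
  (percent-escaped and literal) and only then converting the whole byte
  array to a String using `encoding`.
  Sketch only: assumes `s` is ASCII and throws NumberFormatException
  on an invalid escape such as \"%zz\"."
  [^String s ^String encoding]
  (let [out (ByteArrayOutputStream.)
        n   (count s)]
    (loop [i 0]
      (when (< i n)
        (let [c (.charAt s i)]
          (cond
            ;; '+' means a space in form encoding
            (= c \+)
            (do (.write out 0x20)
                (recur (inc i)))

            ;; "%xx" becomes the single byte 0xXX
            (and (= c \%) (<= (+ i 3) n))
            (do (.write out (Integer/parseInt (subs s (inc i) (+ i 3)) 16))
                (recur (+ i 3)))

            ;; any other character is kept as its own byte
            :else
            (do (.write out (int c))
                (recur (inc i)))))))
    ;; decode once, so multi-byte characters are seen whole
    (String. (.toByteArray out) encoding)))

(decode-bytes-first "%83%82%83W%83o%83P%83R%83%8F%83C" "Shift_JIS")
;; => "モジバケコワイ"
```

With this approach `%83` followed by a literal `W` ends up as the byte pair `0x83 0x57`, which Shift_JIS maps to `ジ`.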

We came up with two workarounds for this issue:

  1. Use org.apache.commons.codec.net.URLCodec instead of java.net.URLDecoder for decoding URL-encoded parameters
  2. Use a form with an enctype of multipart/form-data so that nothing gets URL-encoded, which avoids the problem entirely

I would appreciate feedback on:

  1. Which of the workarounds above is preferable?
  2. Would you be interested in a PR that replaces the decoder used in the params middleware with org.apache.commons.codec.net.URLCodec?

Thanks!
