[Suggestion] Instagram chat analysis #114

Open
ShortTimeNoSee opened this issue Jul 22, 2024 · 15 comments
Labels
app (Related to the Web app), enhancement (New feature or request), report (Related to the report page)

Comments

@ShortTimeNoSee

Considering you can export your Instagram data in JSON, I think it'd be great if there was a tool out there that could create some visual data of your chat analytics, and this service would be a great home for such a tool.

@mlomb
Owner

mlomb commented Jul 22, 2024

Hey there!

> create some visual data of your chat analytics

Do you mean exporting the graphs?

@hopperelec
Contributor

I think when they said "chat analytics", they were referring to analytics of their chats, not this app.

@hopperelec
Contributor

hopperelec commented Jul 22, 2024

For the export instructions, we can reference this. For the structure of the JSON file, I've found a C# project which parses the JSON; we obviously can't use the C# itself but, assuming the format hasn't changed much within the past 5 years, it should give us an idea of what may be contained in the file.

@mlomb
Owner

mlomb commented Jul 22, 2024

Sorry, I don't know why I read "Telegram" instead of "Instagram" lol

Yeah

mlomb added the enhancement, good first issue, app, report, and pipeline labels on Jul 22, 2024
@ShortTimeNoSee
Author

ShortTimeNoSee commented Jul 22, 2024

> Hey there!
>
> > create some visual data of your chat analytics
>
> Do you mean exporting the graphs?

I mean like what chatanalytics does for other messaging services in general, like Discord.

When you export your data from Instagram, you can choose to export your DMs/GCs (messages) in JSON or HTML format. Nothing right now can analyze that chat data and put it in a visual format the way your web app does for other platforms (and I haven't found anything that analyzes Instagram data exports at all).

The data exports for messages come out as one folder per person, and each folder contains message.json file(s). Example:

So I think, given the know-how, it's possible for someone to create a tool that can analyze the JSON exports from message data requests.
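A minimal sketch of how a tool might discover those files, assuming the layout described in this thread (one folder per conversation under an inbox directory, each holding one or more JSON files whose names start with `message`). The function name `find_message_files` and the glob pattern are illustrative assumptions, not part of any existing tool:

```python
from pathlib import Path


def find_message_files(inbox_dir):
    """Map each conversation folder name to its message JSON files.

    Assumes one sub-folder per conversation, each containing files
    matching message*.json (the exact file names may vary by export).
    """
    conversations = {}
    for folder in sorted(Path(inbox_dir).iterdir()):
        if not folder.is_dir():
            continue
        files = sorted(folder.glob("message*.json"))
        if files:
            # Folders without any message file are skipped entirely.
            conversations[folder.name] = files
    return conversations
```

Calling `find_message_files("your_instagram_activity/messages/inbox")` would then return a dictionary keyed by folder name, provided the export was unpacked at that path.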

While typing this I see hopperelec's addition to the Issue; here are some snippets of the format of the JSON file for messages (collapsed because it's a bunch):

Click me to expand JSON snippets
{
  "participants": [
    {
      "name": "display name"
    },
    {
      "name": "my display name"
    }
  ],
  "messages": [
    {
      "sender_name": "display name",
      "timestamp_ms": 1692094898816,
      "content": "NOOOO",
      "is_geoblocked_for_viewer": false
    },
    {
      "sender_name": "display name",
      "timestamp_ms": 1692094896923,
      "content": "UR JOKING",
      "is_geoblocked_for_viewer": false
    },
    {
      "sender_name": "my display name",
      "timestamp_ms": 1692094844275,
      "content": "I used to be a YouTube Shorts guy",
      "is_geoblocked_for_viewer": false
    },
    {
      "sender_name": "my display name",
      "timestamp_ms": 1692094828736,
      "content": "I upgraded though",
      "is_geoblocked_for_viewer": false
    },

Here's an instance of a post being shared:

    {
      "sender_name": "my display name",
      "timestamp_ms": 1692094162049,
      "content": "You sent an attachment.",
      "share": {
        "link": "https://www.instagram.com/reel/CtA55o8gd8v/?id=3116745591766507311_53334871481",
        "share_text": "The whole meet feeling the rizz \u00f0\u009f\u0098\u0085\n\n\u00f0\u009f\u0093\u00b9 @lucasnvota \n\n#trackandfield #rizz #crosscountry",
        "original_content_owner": "runnnsphere"
      },
      "is_geoblocked_for_viewer": false
    },

A photo being sent

    {
      "sender_name": "my display name",
      "timestamp_ms": 1692094109242,
      "photos": [
        {
          "uri": "your_instagram_activity/messages/inbox/display name_uniqueID/photos/365768647_242613148165388_3593794921027562063_n_242613144832055.jpg",
          "creation_timestamp": 1692094108
        }
      ],
      "is_geoblocked_for_viewer": false
    },

A message that has a reaction:

    {
      "sender_name": "my display name",
      "timestamp_ms": 1692093826812,
      "content": "talk to him",
      "reactions": [
        {
          "reaction": "\u00f0\u009f\u0092\u00aa",
          "actor": "display name"
        }
      ],
      "is_geoblocked_for_viewer": false
    },

Sent video with reactions:

    {
      "sender_name": "my display name",
      "timestamp_ms": 1692087378497,
      "videos": [
        {
          "uri": "your_instagram_activity/messages/inbox/display name_uniqueID/videos/366765036_9733517726719643_8623716028334454998_n_800614114872039.mp4",
          "creation_timestamp": 1692087373
        }
      ],
      "reactions": [
        {
          "reaction": "\u00e2\u009d\u00a4\u00ef\u00b8\u008f",
          "actor": "display name"
        }
      ],
      "is_geoblocked_for_viewer": false
    },
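To illustrate how the fields in these snippets could feed basic statistics, here is a rough sketch that counts messages per sender and tallies photos, videos, shares, and reactions. It relies only on the keys visible above (`messages`, `sender_name`, `photos`, `videos`, `share`, `reactions`); the function name `summarize_conversation` is made up for this example, and real exports may contain other message types not covered here:

```python
import json
from collections import Counter


def summarize_conversation(raw_json):
    """Count messages per sender and tally media, shares, and reactions."""
    data = json.loads(raw_json)
    per_sender = Counter()
    kinds = Counter()
    for message in data.get("messages", []):
        per_sender[message["sender_name"]] += 1
        # Each of these keys is optional and only present on some messages.
        kinds["photos"] += len(message.get("photos", []))
        kinds["videos"] += len(message.get("videos", []))
        if "share" in message:
            kinds["shares"] += 1
        kinds["reactions"] += len(message.get("reactions", []))
    return per_sender, kinds
```

Note that the messages in the snippets appear newest first (the `timestamp_ms` values decrease), so anything order-sensitive would need to sort by `timestamp_ms` first.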

@hopperelec
Contributor

Thanks for the samples, those will be very useful!

@mlomb
Owner

mlomb commented Jul 22, 2024

This is totally possible, in fact the structure looks very similar to Messenger exports (both from Meta)

Can you try to load them like they were Messenger exports?


Edit: I want to add that the project is designed so that adding new platforms is relatively easy, so we could support many more platforms eventually. The problem is that companies may not be consistent with their export formats :(

@ShortTimeNoSee
Author

ShortTimeNoSee commented Jul 22, 2024

> This is totally possible, in fact the structure looks very similar to Messenger exports (both from Meta)
>
> Can you try to load them like they were Messenger exports?

That seems to work pretty well!

image

This part didn't register videos sent or edits (and there isn't anything built in for Reels sent etc.), but those are small aspects. Here's an edited message's JSON though, where I added emojis to the end of the message content:

    {
      "sender_name": "my display name",
      "timestamp_ms": 1720829614035,
      "content": "Last year \u00f0\u009f\u0098\u00ad (edited)",
      "is_geoblocked_for_viewer": false
    },
    {
      "sender_name": "my display name",
      "timestamp_ms": 1720829590324,
      "content": "Last year",
      "is_geoblocked_for_viewer": false
    },

Your web app likely sees this as two messages instead of one message that was edited.

It also doesn't read emojis correctly: it only picks up those that don't get turned into Unicode escape sequences, or text that it thinks is in Discord's emoji format, as you can see here:

image

For some reason I was under the impression that there was something that also compared message lengths but I'm not seeing that so I think I'm imagining things (could make an issue for that though, might be a good additional feature).

@hopperelec
Contributor

> For some reason I was under the impression that there was something that also compared message lengths but I'm not seeing that

Under Language > Language Statistics there is "Average words per message". There is also an issue for this #67

@ShortTimeNoSee
Author

Demonstration of it seeing the edited message as separate messages

image

And I just noticed this as well:

image

> > For some reason I was under the impression that there was something that also compared message lengths but I'm not seeing that
>
> Under Language > Language Statistics there is "Average words per message". There is also an issue for this #67

Oh, I don't think I've even noticed that. Whatever I was thinking of compared the message lengths between the users.

@mlomb
Owner

mlomb commented Jul 22, 2024

The edited messages would be really hard to do since (from the JSON you sent) we have no way to connect both messages (no IDs). Right now each message stores the time it was edited and its latest content. We could either skip all "(edited)" messages, which breaks the edit stats, or create empty ghost messages, which skews the message counts (maybe we could do some dirty trick to skip them, but I don't know). Now that I think of it, we need to know both the time sent and the time edited to compute the stats.


You are right about the emojis: we should handle Unicode-escaped emojis (and symbols, for that matter). Maybe in a new issue?
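For reference, the escape sequences in the samples above look like UTF-8 bytes written out as individual `\u00XX` code points. A minimal sketch of one way to undo that, assuming this is what the export does (the helper name `fix_mojibake` is hypothetical and not part of the project):

```python
import json


def fix_mojibake(text):
    """Reinterpret Latin-1 code points as UTF-8 bytes.

    Falls back to the original text if it does not round-trip,
    e.g. when the string was already decoded correctly.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# The reaction from one of the samples in this thread.
raw = '{"reaction": "\\u00f0\\u009f\\u0092\\u00aa"}'
reaction = json.loads(raw)["reaction"]
fixed = fix_mojibake(reaction)  # fixed == "\U0001F4AA" (flexed biceps)
```

The fallback matters because text that is already correct, or that contains a genuine Latin-1 character on its own, should be left untouched rather than raise an error.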

@ShortTimeNoSee
Author

> we have no way to connect both messages (no IDs)

Yeah that's what I was thinking. Messages can be edited within 15 minutes, but people send multiple messages in that span of time and there's no guaranteeing which message was edited. I think the one thing that can be done relating to this though is excluding messages that end with " (edited)" from any message statistics.
Oh, I got ahead of myself there; you already mentioned that.
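A rough sketch of that exclusion, assuming the only signal available is the trailing " (edited)" text seen in the sample above. The names `EDITED_SUFFIX` and `split_edited` are invented for illustration:

```python
EDITED_SUFFIX = " (edited)"


def split_edited(messages):
    """Separate messages whose content ends with the edited marker."""
    originals, edited = [], []
    for message in messages:
        # Media-only messages have no "content" key, so default to "".
        content = message.get("content", "")
        if content.endswith(EDITED_SUFFIX):
            edited.append(message)
        else:
            originals.append(message)
    return originals, edited
```

One caveat: a message where someone literally typed " (edited)" at the end would be misclassified, since the export gives no other way to tell the two cases apart.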

And I just created an issue for the emoji handling 🙏

@mlomb mlomb removed good first issue Good for newcomers pipeline Related to the pipeline labels Jul 22, 2024
@mlomb
Owner

mlomb commented Jul 22, 2024

Nice! I'll leave this open for now. We should make clear that Instagram exports are compatible with Messenger exports. Maybe have two buttons in the UI that use the same parser and have different instructions.

And maybe rename "MessengerParser" to "MetaParser" (?)
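The two-button idea could be sketched as a small registry where both entries point at the same parser but carry their own instructions. This is only an illustration in Python: the `Platform` record, the `PLATFORMS` table, the stand-in `MessengerParser` class, and the placeholder instruction strings are all assumptions and do not reflect the project's actual code:

```python
from dataclasses import dataclass


class MessengerParser:
    """Stand-in for the real parser; only here to make the sketch runnable."""


@dataclass(frozen=True)
class Platform:
    label: str
    parser: type
    instructions: str


# Two UI entries sharing one parser, each with its own (placeholder) instructions.
PLATFORMS = {
    "messenger": Platform("Messenger", MessengerParser, "Placeholder: how to export from Messenger."),
    "instagram": Platform("Instagram", MessengerParser, "Placeholder: how to export from Instagram."),
}
```

Keeping the parser shared means any fix made for one export format applies to the other, while the instructions stay platform-specific.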

@hopperelec
Contributor

hopperelec commented Jul 22, 2024

MetaParser wouldn't really make sense because WhatsApp is also made by Meta but that has its own parser. Also, I would personally read "MetaParser" as meaning a parser for chat-analytics messages, whatever that means lol

@mlomb
Owner

mlomb commented Jul 22, 2024

> WhatsApp is also made by Meta

jeez these Meta guys, true


I don't use IG/FB, but now that I recall you can message people between both platforms (interoperable). Maybe that's why it is so similar.


3 participants