[Touitomamout] Tweets not syncing #113

Open
eudemocracystiftung opened this issue Dec 13, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@eudemocracystiftung

Hi,
We are using twitter-scraper via Touitomamout and are encountering this issue: https://github.com/louisgrasset/touitomamout/issues/241

Basically, the content-mapper does not find any tweets. So the whole process goes smoothly with no error messages, but nothing gets synced.

Any ideas?
Thanks!

@karashiiro
Collaborator

Not sure if this is the issue or if there are multiple separate issues right now, but I ran the scraper tests just now and couldn't log in:

    Unknown subtask ArkoseLogin

      109 |         next = await this.handleSuccessSubtask(next);
      110 |       } else {
    > 111 |         throw new Error(`Unknown subtask ${next.subtask.subtask_id}`);
          |               ^
      112 |       }
      113 |     }
      114 |     if ('err' in next) {

      at TwitterUserAuth.login (src/auth-user.ts:111:15)
      at Scraper.login (src/scraper.ts:410:5)
      at getScraper (src/test-utils.ts:55:5)
      at Object.<anonymous> (src/tweets.test.ts:331:19)

The login flow involves a bunch of task messaging back and forth, and I haven't seen this task name before, so it's possible that the login flow changed, in which case a bunch of APIs will stop working. Still need to figure out how exactly this changed, though.
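For anyone unfamiliar with the flow, here is a minimal sketch of the kind of dispatch loop that produces this error. It is not the scraper's actual code: `FlowResult`, `SubtaskHandler` and `runLoginFlow` are hypothetical names, and the real flow is asynchronous and talks to the network.

```typescript
// Hypothetical shapes for illustration; the real scraper's types may differ.
type FlowResult =
  | { status: "success"; subtaskId?: string }
  | { status: "error"; err: Error };

type SubtaskHandler = (prev: FlowResult) => FlowResult;

// Keep answering subtasks until the server stops asking for one.
function runLoginFlow(
  first: FlowResult,
  handlers: Map<string, SubtaskHandler>,
): FlowResult {
  let next: FlowResult = first;
  while (next.status === "success") {
    const id = next.subtaskId;
    if (id === undefined) {
      break; // no further subtask requested: the flow is complete
    }
    const handler = handlers.get(id);
    if (handler === undefined) {
      // Same failure mode as in the trace above: nothing knows this step.
      throw new Error(`Unknown subtask ${id}`);
    }
    next = handler(next);
  }
  return next;
}
```

With this shape, any subtask id that has no registered handler (such as `ArkoseLogin`) aborts the login with `Unknown subtask ArkoseLogin`, which matches the trace.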

@karashiiro added the bug label on Dec 13, 2024

@eudemocracystiftung
Author

Thanks @karashiiro. Any idea whether this can be fixed in the medium term?

@karashiiro
Collaborator

Still looking into it, but I'm not sure how soon it can be fixed, since I haven't deciphered the new login flow yet.

I'll be out of the country for a few weeks so I don't have much time to work on this before the end of the year — hoping I can figure this out within the week.

@karashiiro
Collaborator

karashiiro commented Dec 14, 2024

Actually, when I tried debugging this just now, I manually logged into the account I was using for unit tests to compare against the real login flow for that particular account in the browser console. It was the same as what it had been before it broke, and then when I ran the unit tests again they all worked as expected.

I wonder if this means this issue was some kind of suspicious account check, and logging in through the website cleared it? At any rate, there doesn't seem to be any new fundamental issue with the scraper. I do still want to understand why this alternate auth path happened before I logged in on the website, though.

@karashiiro changed the title from "Tweets not syncing" to "Unknown subtask ArkoseLogin after not manually logging into an account for some time" on Dec 14, 2024

@karashiiro
Collaborator

So as for next steps, I suggest manually logging into your scraping account once and seeing if that magically fixes things.

Ideally, when you do that, log in from the same geographical region (or IP if possible) as your service to avoid accidentally tripping any anti-phishing systems.
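If you want your service to flag this case instead of just crashing, something like the following might work. This is only a sketch: `isArkoseChallenge` and `loginWithHint` are hypothetical helpers, not part of the scraper, and matching on the error message text is an assumption based on the trace above, so it would break if the message ever changes.

```typescript
// Hypothetical helper: recognise the error text seen in the trace above.
function isArkoseChallenge(err: unknown): boolean {
  return (
    err instanceof Error &&
    err.message.includes("Unknown subtask ArkoseLogin")
  );
}

// Hypothetical wrapper around whatever login call your service makes.
async function loginWithHint(
  login: () => Promise<void>,
): Promise<"ok" | "needs-manual-login"> {
  try {
    await login();
    return "ok";
  } catch (err) {
    if (isArkoseChallenge(err)) {
      // Signal the caller to ask a human to log in through the website once.
      return "needs-manual-login";
    }
    throw err; // anything else is a different problem
  }
}
```

You would pass your own login call as the `login` argument and treat `"needs-manual-login"` as a prompt to sign in through the website.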

@karashiiro
Collaborator

karashiiro commented Dec 14, 2024

Oh, I just reread the touitomamout issue, this happens with a guest login as well. Possibly a separate issue here, then.

edit: wait no I misread the original issue, auth issues are still the most likely cause

@eudemocracystiftung
Author

Fair enough, I can try that and thanks for looking into it. However, when this first happened, I was also logging in with no issues on the side. I will try and make sure that I do so from the same place and report back later tonight.

@karashiiro
Collaborator

karashiiro commented Dec 14, 2024

I just ran the repo's CI action and saw the same issue there (logs), which is another point in favor of this explanation. I was still able to run the tests from my local system that I was logged in on. The anti-phishing checks weren't this strict before, so they might've locked things down even more strictly recently.

@eudemocracystiftung
Author

Logged into the twitter account, was asked to provide a code sent by email, then all worked fine. Tried touitomamout and got the same result as always.

2024-12-14 22:29:17 touitomamout  | 
2024-12-14 22:29:17 touitomamout  | [email protected]
2024-12-14 22:29:17 touitomamout  | 
2024-12-14 22:29:17 touitomamout  | ⚙️ cache        ✔ task finished
2024-12-14 22:29:17 touitomamout  | - connecting to twitter...
2024-12-14 22:29:18 touitomamout  | 🦤 client       ✔ connected (session restored)
2024-12-14 22:29:19 touitomamout  | - connecting to bluesky...
2024-12-14 22:29:20 touitomamout  | ☁️ client       ✔ connected
2024-12-14 22:29:22 touitomamout  | profile-sync    ✔ task finished
2024-12-14 22:29:29 touitomamout  | content-mapper  ✔ tweets: total: 0 retweets: 0 replies: 0 quotes: 0
2024-12-14 22:29:29 touitomamout  | content-mapper  ✔ task finished
2024-12-14 22:29:29 touitomamout  | 
2024-12-14 22:29:29 touitomamout  | 🦤 → 🦣+☁️
2024-12-14 22:29:29 touitomamout  | Touitomamout sync | v1.8.0
2024-12-14 22:29:29 touitomamout  | | Twitter handle: @appf_eu
2024-12-14 22:29:29 touitomamout  | | 00000  ʲᵘˢᵗ ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ
2024-12-14 22:29:29 touitomamout  | | 00000  ˢʸⁿᶜᵉᵈ ᵖᵒˢᵗˢ ˢᵒ ᶠᵃʳ
2024-12-14 22:29:29 touitomamout  | Run daemon every 30min

@eudemocracystiftung
Author

Actually, logged out, changed browser, logged back in, this time no email confirmation asked. Still same output as above.

@karashiiro
Collaborator

karashiiro commented Dec 16, 2024

I went ahead and cloned touitomamout to try and get a better understanding of what's happening, and I'm fairly certain this is an issue on their end (either a bug or more likely a documentation issue).

Dropping a breakpoint here (and enabling sourcemaps from Vite) shows that the scraper is able to retrieve tweets, it's just that getEligibleTweet is returning undefined, signaling that everything in your account was filtered out. This is the line of code that determines whether or not a post is replicated - breaking it down, it roughly means:

  1. The post must not be a retweet.
  2. The post must not be a comment on one of your own posts.
  3. The post must not be a QRT of one of your own posts.
  4. The post must have been made in the past 3 weeks.
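As a rough illustration, the four rules above could be written like this. This only mirrors the list as I've described it, not touitomamout's actual code: `TweetLike`, its field names, and `isEligible` are hypothetical.

```typescript
// Hypothetical tweet shape; field names are assumptions for illustration.
interface TweetLike {
  isRetweet: boolean;
  inReplyToUser?: string; // handle of the account being replied to, if any
  quotedUser?: string;    // handle of the account being quoted, if any
  createdAt: Date;
}

const THREE_WEEKS_MS = 21 * 24 * 60 * 60 * 1000;

function isEligible(
  tweet: TweetLike,
  ownHandle: string,
  now: Date = new Date(),
): boolean {
  if (tweet.isRetweet) return false;                   // rule 1
  if (tweet.inReplyToUser === ownHandle) return false; // rule 2
  if (tweet.quotedUser === ownHandle) return false;    // rule 3
  // rule 4: posted within the past 3 weeks
  return now.getTime() - tweet.createdAt.getTime() <= THREE_WEEKS_MS;
}
```

If every post on the account fails at least one of these checks, the sync reports zero tweets without raising any error, which is consistent with the log output earlier in this thread.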

@karashiiro changed the title from "Unknown subtask ArkoseLogin after not manually logging into an account for some time" to "[Touitomamout] Tweets not syncing" on Dec 16, 2024

@karashiiro
Collaborator

Copied the comment to the original issue for tracking purposes

@eudemocracystiftung
Author

> I went ahead and cloned touitomamout to try and get a better understanding of what's happening, and I'm fairly certain this is an issue on their end (either a bug or more likely a documentation issue).
>
> Dropping a breakpoint here (and enabling sourcemaps from Vite) shows that the scraper is able to retrieve tweets, it's just that getEligibleTweet is returning undefined, signaling that everything in your account was filtered out. This is the line of code that determines whether or not a post is replicated - breaking it down, it roughly means:
>
>   1. The post must not be a retweet.
>   2. The post must not be a comment on one of your own posts.
>   3. The post must not be a QRT of one of your own posts.
>   4. The post must have been made in the past 3 weeks.

This is awesome! Thanks for digging into this so quickly! We'll keep our fingers crossed for the resolution of this issue.

@danbednarski

Experiencing a similar problem with https://github.com/ai16z/eliza-starter.

I ran the project with valid Twitter credentials and it crashed with the following error:

Error: Unknown subtask ArkoseLogin
    at TwitterUserAuth.login (file:///Users/danielbednarski/Documents/eliza-starter/node_modules/.pnpm/[email protected]/node_modules/agent-twitter-client/dist/node/esm/index.mjs:351:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async Scraper.login (file:///Users/danielbednarski/Documents/eliza-starter/node_modules/.pnpm/[email protected]/node_modules/agent-twitter-client/dist/node/esm/index.mjs:2280:5)
    at async _ClientBase.init (file:///Users/danielbednarski/Documents/eliza-starter/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_opena_5krollsqbthh335txyixvqmavu/node_modules/@ai16z/client-twitter/dist/index.js:1015:7)
    at async Object.start (file:///Users/danielbednarski/Documents/eliza-starter/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_opena_5krollsqbthh335txyixvqmavu/node_modules/@ai16z/client-twitter/dist/index.js:1377:5)
    at async initializeClients (file:///Users/danielbednarski/Documents/eliza-starter/src/index.ts:132:32)
    at async startAgent (file:///Users/danielbednarski/Documents/eliza-starter/src/index.ts:189:25)
    at async startAgents (file:///Users/danielbednarski/Documents/eliza-starter/src/index.ts:211:13)

Other users report the same problem. Happy to help debug this and provide any additional info which may be useful.

@karashiiro
Collaborator

Creating #114 for that as it's a separate issue.
