@bliuchak
Contributor

@bliuchak bliuchak commented Oct 8, 2025

This PR adds HTTPS server support to proxy-chain.

Also fixed:

  • a data race on the error event that could cause the same event to be logged more than once
  • usage statistics tracking

Otherwise, my changes should be fully compatible with the HTTP server and all the handlers.

Readiness checklist:

  • Implement HTTPS server
    • Basic implementation
    • Tooling to run HTTPS locally
    • Tests
  • Current state TLS overhead analysis
    • Investigate where TLS overhead might be possible (both in legacy and https implementation)
    • Verify TLS overhead tracked correctly for these cases
      • Forward via HTTPS upstream
      • Chain via HTTPS upstream
  • Implement TLS overhead bytes tracking

- Fix a datarace for error handler
- Add a regression test that verify datarace fix
- Add TLS defaults for better security
@github-actions github-actions bot added t-core-services Issues with this label are in the ownership of the core services team. tested Temporary label used only programatically for some analytics. labels Oct 8, 2025
@bliuchak bliuchak added the t-unblocking Issues with this label are in the ownership of the unblocking team. label Oct 8, 2025
@jirimoravcik
Member

Also fixed:

  • a datarace for error event when we might log same events
  • fix for usage statistics trackings

Could you please point me to the changes that are related to the fixes? Thanks. Also, what was wrong with the statistics?

@bliuchak
Contributor Author

bliuchak commented Oct 9, 2025

Also fixed:

  • a datarace for error event when we might log same events
  • fix for usage statistics trackings

Could you please point me to the changes that are related to the fixes? Thanks. Also, what was wrong with the statistics?

  1. Datarace
    1. Fix - e6adb19#diff-8a8ae07582c9d433ec8c2e5c4310ff8901e604f4965c5b90a49117ad46c47595R335
    2. Regression tests - https://github.com/apify/proxy-chain/pull/602/files#diff-d14cbfb50ed1cad7db5f4fef6a6076961b7cc9be980a3be06a70998f0eb8ebceR1456-R1599
  2. Statistics
    1. Fix - 313f535#diff-8a8ae07582c9d433ec8c2e5c4310ff8901e604f4965c5b90a49117ad46c47595R658-R659
    2. Regression tests - https://github.com/apify/proxy-chain/pull/602/files#diff-d14cbfb50ed1cad7db5f4fef6a6076961b7cc9be980a3be06a70998f0eb8ebceR830-R871

Also, what was wrong with the statistics?

I don't remember with 100% certainty right now, but a few tests failed in the HTTPS scenarios. I believe there were some issues related to undefined values in the statistics.

Member

@jirimoravcik jirimoravcik left a comment

Looks good, had a few comments.
In addition to that, could you please:

  1. Bump the package version
  2. Describe all the new things in README.md (which serves as the primary user-facing documentation)
    Thanks

@bliuchak
Contributor Author

@jirimoravcik @lewis-wow Guys, I've added the main logic for tracking TLS overhead bytes. Please take a look 🙏

Gonna polish the tests in the meantime and push 'em ASAP.


// Check once per connection for socket._parent availability.
if (this.serverType === 'https') {
    const rawSocket = socket._parent;
Member

It may be worth asking in https://github.com/nodejs/node about the safety of using this private property.

Member

I guess I know the answer :) As long as unit tests cover the eventuality that this gets removed, I think we're good.

Member

It’s not just about the presence of the _parent property, but about the overall usability for stats tracking. I mean, if you consider yourself an expert on the TLS implementation in Node.js, that’s great. :)

Contributor Author

@bliuchak bliuchak Oct 23, 2025

fyi: nodejs/help#5111 🤞
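
Since `_parent` is an undocumented property, one way to make its possible removal testable is a small guard that returns the raw counters or `undefined`. The following is only a sketch under that assumption; `RawCounters` and `getRawCounters` are hypothetical names and are not part of the PR.

```typescript
// Hypothetical shape of the counters read from the underlying socket.
interface RawCounters {
    bytesWritten: number;
    bytesRead: number;
}

// Hypothetical guard: returns the underlying socket's byte counters,
// or undefined when `_parent` is missing or does not expose numeric counters.
function getRawCounters(socket: unknown): RawCounters | undefined {
    if (typeof socket !== 'object' || socket === null) return undefined;

    // `_parent` is a private Node.js property; treat it as possibly absent.
    const parent = (socket as { _parent?: unknown })._parent;
    if (typeof parent !== 'object' || parent === null) return undefined;

    const { bytesWritten, bytesRead } = parent as Partial<RawCounters>;
    if (typeof bytesWritten !== 'number' || typeof bytesRead !== 'number') return undefined;

    return { bytesWritten, bytesRead };
}
```

A unit test can then pass plain objects with and without `_parent` to check that the stats code falls back cleanly.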

Contributor

@lewis-wow lewis-wow left a comment

Nice! Have nothing to add.

Member

@jirimoravcik jirimoravcik left a comment

Found a few more points for discussion

Comment on lines +231 to +249
if (options.serverType === 'https') {
    if (!options.httpsOptions) {
        throw new Error('httpsOptions is required when serverType is "https"');
    }

    // Apply secure TLS defaults (user options can override)
    // This prevents users from accidentally configuring insecure TLS settings
    const secureDefaults: https.ServerOptions = {
        ...HTTPS_DEFAULTS,
        honorCipherOrder: true, // Server chooses cipher (prevents downgrade attacks)
        ...options.httpsOptions, // User options override defaults
    };

    this.server = https.createServer(secureDefaults);
    this.serverType = 'https';
} else {
    this.server = http.createServer();
    this.serverType = 'http';
}
Member

Maybe validate that options.serverType is one of 'http' or 'https'? It would make it consistent with the type. I'd also default it to 'http' in the constructor parameter. That way you could just do this.serverType = options.serverType.
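
A minimal sketch of what this suggestion could look like, assuming a standalone helper rather than the actual constructor. `SERVER_TYPES`, `ServerTypeOptions`, and `resolveServerType` are hypothetical names used only for illustration.

```typescript
// Hypothetical names for illustration; not taken from the PR.
const SERVER_TYPES = ['http', 'https'] as const;
type ServerType = (typeof SERVER_TYPES)[number];

interface ServerTypeOptions {
    serverType?: string;
    httpsOptions?: object;
}

function resolveServerType(options: ServerTypeOptions = {}): ServerType {
    // Default to plain HTTP when the caller does not specify a type.
    const serverType = options.serverType ?? 'http';

    // Reject anything outside the declared union so runtime matches the type.
    if (!(SERVER_TYPES as readonly string[]).includes(serverType)) {
        throw new Error(`serverType must be one of: ${SERVER_TYPES.join(', ')}`);
    }
    if (serverType === 'https' && !options.httpsOptions) {
        throw new Error('httpsOptions is required when serverType is "https"');
    }
    return serverType as ServerType;
}
```

With a helper like this, the constructor could assign the validated value once and branch on it afterwards.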

socket.proxyChainErrorHandled = true;

// Log errors only if there are no user-provided error handlers
if (this.listenerCount('error') === 0) {
Member

There was === 1 before, why is it === 0 now?
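
For context, a small sketch of how `listenerCount('error')` behaves on a Node.js `EventEmitter`. Which threshold is correct depends on whether the server registers an internal 'error' listener of its own, which this thread does not show. `shouldLogError` and its `internalListeners` parameter are hypothetical.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical helper: log only when no listeners beyond the internal ones are attached.
function shouldLogError(emitter: EventEmitter, internalListeners = 0): boolean {
    return emitter.listenerCount('error') === internalListeners;
}

const emitter = new EventEmitter();

// No listeners yet, so listenerCount('error') is 0.
const noUserHandlers = shouldLogError(emitter);

emitter.on('error', () => { /* user-provided handler */ });

// One user listener attached, so the count no longer matches 0.
const withUserHandler = shouldLogError(emitter);
```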

Comment on lines +752 to 778
+ // Socket contains application bytes only.
+ let srcTxBytes = socket.bytesWritten ?? 0;
+ let srcRxBytes = socket.bytesRead ?? 0;
+
+ if (this.serverType === 'https' && socket.tlsOverheadAvailable) {
+     /* eslint no-underscore-dangle: ["error", { "allow": ["_parent"] }] */
+     // Access underlying raw socket to get total bytes (app + TLS overhead).
+     const rawSocket = socket._parent;
+     if (rawSocket && typeof rawSocket.bytesWritten === 'number' && typeof rawSocket.bytesRead === 'number') {
+         if (rawSocket.bytesWritten >= socket.bytesWritten && rawSocket.bytesRead >= socket.bytesRead) {
+             srcTxBytes = rawSocket.bytesWritten;
+             srcRxBytes = rawSocket.bytesRead;
+         } else {
+             // This should never happen, log for debugging.
+             this.log(connectionId, `Warning: TLS overhead count error.`);
+         }
+     }
+ }
+
  const targetStats = getTargetStats(socket);

- const result = {
-     srcTxBytes: socket.bytesWritten,
-     srcRxBytes: socket.bytesRead,
-     trgTxBytes: targetStats.bytesWritten,
-     trgRxBytes: targetStats.bytesRead,
+ return {
+     srcTxBytes, // HTTP: app only, HTTPS: total (app + TLS overhead)
+     srcRxBytes, // HTTP: app only, HTTPS: total (app + TLS overhead)
+     trgTxBytes: targetStats?.bytesWritten,
+     trgRxBytes: targetStats?.bytesRead,
  };
Member

Just an idea. Why don't we store the _parent socket in this.connections? Looking at the logic here in getConnectionsStats, you just fall back to the original socket for stats. That makes me question why we would ever want to use the TLS socket for connection tracking.
That gives me another idea: if you listen with this.server.on('connection') for HTTPS, you should be able to reach the original Socket without using _parent, right?
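
However the raw socket is obtained (via `_parent` or a 'connection' listener), the byte-selection logic in the diff above can be isolated into a pure function over the two sockets' counters. This is a sketch only; `ByteCounters`, `SourceBytes`, and `pickSourceBytes` are hypothetical names and not part of the PR.

```typescript
interface ByteCounters {
    bytesWritten?: number;
    bytesRead?: number;
}

interface SourceBytes {
    srcTxBytes: number;
    srcRxBytes: number;
    includesTlsOverhead: boolean;
}

// Hypothetical helper: prefer raw-socket totals (app + TLS overhead) when they are plausible,
// otherwise fall back to the application-level counters.
function pickSourceBytes(appSocket: ByteCounters, rawSocket?: ByteCounters): SourceBytes {
    const appTx = appSocket.bytesWritten ?? 0;
    const appRx = appSocket.bytesRead ?? 0;
    const rawTx = rawSocket?.bytesWritten;
    const rawRx = rawSocket?.bytesRead;

    // Raw counters must both be numbers and must not be smaller than the app-level counters.
    if (typeof rawTx === 'number' && typeof rawRx === 'number' && rawTx >= appTx && rawRx >= appRx) {
        return { srcTxBytes: rawTx, srcRxBytes: rawRx, includesTlsOverhead: true };
    }
    return { srcTxBytes: appTx, srcRxBytes: appRx, includesTlsOverhead: false };
}
```

Keeping this separate from socket lookup means the fallback paths can be tested with plain objects, without starting a TLS server.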

4 participants