Skip to content

Conversation

@tttrrryyyyeer
Copy link

@tttrrryyyyeer tttrrryyyyeer commented May 5, 2020

Before
image

After
image

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.815 (1909/November2018Update/19H2)
Intel Core i5-3230M CPU 2.60GHz (Ivy Bridge), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.1.201

No breaking changes

@enemaerke
Copy link
Contributor

Hi @esmelov.

Thanks for submitting. I see you've taken the effort to split stuff into multiple files. We've had a previous PR that actually pulled everything into a single file (supposedly for easy embedding into other projects) which this would break that behavior.

@atifaziz, I believe you introduced the single file back in the day. Any comments or objections to the overall change in this PR?

Copy link
Contributor

@atifaziz atifaziz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had a previous PR that actually pulled everything into a single file (supposedly for easy embedding into other projects) which this would break that behavior.

@atifaziz, I believe you introduced the single file back in the day. Any comments or objections to the overall change in this PR?

That's right. This was done in ac4c3fb and now we'd be going backward. Strictly speaking, though, this doesn't qualify as a behavioural change.

Embedding was one advantage as a side-effect of having a single file, but frankly, after refactoring to a simple design, we were down to just 361 lines of code, including 18 lines of the license header region! When it's that small, it doesn't make sense to spread types into separate files for the sake of it. It makes a simple and single-purpose project such as this one difficult to understand and navigate.

Later, UAParser.cs nearly doubled in size (at 668 lines) when @enemaerke added doc comments in a7d31ba but that's a separate point and value to debate.

I am sure @esmelov had all the good intentions with this PR but I have several problems with it:

  1. Too many things rolled into one PR. The subject line lists them all but I would add restructuring and refactoring too. You have to take the whole pie or nothing.
  2. Because of 1, there are so many aspects and files (30!) to review that there must have been an assumption that maintainers will have the time to tear apart everything and review in one sitting. I know I don't.
  3. The maintainers were never asked (by opening an issue) if there's a problem to solve, what changes (besides incidental complexity) would be acceptable.
  4. There is no problem statement that talks about the problem that this PR is solving. Was it performance? If so, what was the bottleneck or hot path?
  5. Code was refactored/restructured to a new design with no justification. You could say it was needed to improve the performance but there's no explanation of what gave better numbers.
  6. The code formatting/style was changed.

Again, I am sure @esmelov had all the good intentions and I commend him for wanting to contribute his ideas back to help improve the project but I think the PR is all over the place and sparse on descriptive details. There's good stuff in there I see like considerable reduction of heap usage (not surprised somehow), but I wouldn't be comfortable with accepting such a huge change in this form without understanding the motivations and design of the improvements. That is unless @esmelov is willing to take over the maintenance. The final call for this PR rests with @enemaerke. I am happy to help review (even though I am short on time) and be part of the discussion if the PR can be restarted and broken down into distinct ones with least changes needed per PR. Changes like re-formatting and restructuring the project files is something that should be considered later and debated and justified entirely separately.

@tttrrryyyyeer
Copy link
Author

We've had a previous PR that actually pulled everything into a single file (supposedly for easy embedding into other projects) which this would break that behavior.
@atifaziz, I believe you introduced the single file back in the day. Any comments or objections to the overall change in this PR?

That's right. This was done in ac4c3fb and now we'd be going backward. Strictly speaking, though, this doesn't qualify as a behavioural change.

Embedding was one advantage as a side-effect of having a single file, but frankly, after refactoring to a simple design, we were down to just 361 lines of code, including 18 lines of the license header region! When it's that small, it doesn't make sense to spread types into separate files for the sake of it. It makes a simple and single-purpose project such as this one difficult to understand and navigate.

Later, UAParser.cs nearly doubled in size (at 668 lines) when @enemaerke added doc comments in a7d31ba but that's a separate point and value to debate.

I am sure @esmelov had all the good intentions with this PR but I have several problems with it:

  1. Too many things rolled into one PR. The subject line lists them all but I would add restructuring and refactoring too. You have to take the whole pie or nothing.
  2. Because of 1, there are so many aspects and files (30!) to review that there must have been an assumption that maintainers will have the time to tear apart everything and review in one sitting. I know I don't.
  3. The maintainers were never asked (by opening an issue) if there's a problem to solve, what changes (besides incidental complexity) would be acceptable.
  4. There is no problem statement that talks about the problem that this PR is solving. Was it performance? If so, what was the bottleneck or hot path?
  5. Code was refactored/restructured to a new design with no justification. You could say it was needed to improve the performance but there's no explanation of what gave better numbers.
  6. The code formatting/style was changed.

Again, I am sure @esmelov had all the good intentions and I commend him for wanting to contribute his ideas back to help improve the project but I think the PR is all over the place and sparse on descriptive details. There's good stuff in there I see like considerable reduction of heap usage (not surprised somehow), but I wouldn't be comfortable with accepting such a huge change in this form without understanding the motivations and design of the improvements. That is unless @esmelov is willing to take over the maintenance. The final call for this PR rests with @enemaerke. I am happy to help review (even though I am short on time) and be part of the discussion if the PR can be restarted and broken down into distinct ones with least changes needed per PR. Changes like re-formatting and restructuring the project files is something that should be considered later and debated and justified entirely separately.

Thank you for your feedback, @enemaerke @atifaziz ! I will try to explain why I divided it into classes. First, I was guided by the principles of OOP and SOLID, since this is more familiar to developers in C# than the functional style. Delegates are also additional allocations, and linq kills performance. The bottleneck was and still is working with string types. Read Only Span partially solves this problem.
Unfortunately, I did not know the reason for the location of all the code in a single file, and if this is critical, I can try to optimize it without changing the project structure.
I will be happy to help improve the project in the future.

@enemaerke
Copy link
Contributor

I am a little swamped at the moment, so give me a little time to take a deeper looking into this PR and provide some more detailed feedback.

@enemaerke
Copy link
Contributor

Okay, so I have carved out a little time to have a look at this in more detail bearing in mind the goal you described of your changes.

Here are my overall thoughts:

  • Really appreciate the interest and the work put into the library. Thanks!
  • Implementing a dedicated BenchMark is also a great idea for benchmarking various aspects in a consistent manner. Even more so when it is able to highlight perf benefits (less allocations)
  • From your description the less allocations are from a switch to less Func-based implementation so that is something that should be included going forward
  • Adding more framework targets? Sure, why not. It does bloat up the actual NuGet a little bit but I guess that is the only real drawback I can think of so we should keep that.
  • I would appreciate that the formatting is kept in it's original format (keeping the indentation etc)
  • I must admit that I am conflicted about the split into multiple classes/files. Might just be habit speaking but I had come to like the single file for this project (obviously not how I structure anything of a larger scope, but this feels pretty isolated). Not sure if the original idea (you can just copy it into your own project) has been used at all.
  • Both the split into multiple classes and the redesign away from Func and the introduction of abstractions makes it difficult to do a proper review of actual changes being made.

I appreciate the SOLID steps taken in introducing abstractions and I am also more comfortable in "standard" OOP than a more functional style so perhaps a logical next step would be to try to see what a single-file version of your changes would look like, keeping your changes but merging them into the original file (with original formatting). Hopefully this would make it more easy to review the actual changes and then we can decide if we should split it up into files.

Hope this makes sense

@jammycakes
Copy link

Regarding framework targets -- these do need to be kept up to date. Adding netstandard2.0 and netstandard2.1 should suffice. However, you don't need to include targets for both .NET Standard and .NET Core, since .NET Standard implicitly targets .NET Core as well. Also, since you're targeting netstandard1.0 you can also remove the net45 target because .NET 4.5 implements .NET Standard 1.0. Going forward, you would then only need to add targets to new .NET Standard versions as they come out.

Agreed that this pull request is doing too much though. It really needs to be split up into separate PRs which should be evaluated individually on their own merits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants