Conversation

@coturiv
Member

@coturiv coturiv commented Nov 26, 2025

  • v4.0.0rc1
  • Update test data
  • Fix convention issues
  • Remove npm list from travis
  • Load string comparison algorithm from config
  • Update README.md
  • Update README
  • v4.0.0rc2
  • v4.0.0rc3
  • v4.0.0rc4
  • v4.0.0rc5
  • v4.0.0
  • Add more tests
  • Update README.md
  • Update README.md
  • Update HTML filter rules
  • v4.0.1
  • Fix some stuff
  • v4.0.2
  • Update README.md
  • v4.0.4
  • Update rules
  • Update rules
  • v4.0.7
  • Move constants to config
  • Update travis
  • Downgrade tap
  • Update README.md
  • Update README.md
  • Update README.md
  • Update README.md
  • Update package.json
  • v4.0.9
  • v4.1.0
  • v4.1.1
  • Allow empty tags in setSanitizeHtmlOptions allowedTags
  • v4.1.2
  • Change nock version
  • Update deps, travis config
  • Replace es6-readability with readabilitySAX
  • Refactor to use commonjs syntax instead of depending on esm module
  • v4.2.0
  • v4.2.1
  • Update README.md
  • count words in html block to qualify as article
  • Fix coding style from PR extractus/article-extractor#115 (count words in html block to qualify as article)
  • Update dependencies
  • v4.2.3
  • Change version to v4.2.3
  • Update dependencies
  • Remove pnpm-lock.yaml from source
  • Change version to v4.2.4
  • Add types definition
  • Update travis config
  • Update badge links
  • Fix potential issues
  • Switch to GitHub actions
  • Update README.md
  • Update ci-test.yml
  • Update package.json
  • Update package.json
  • Update ci-test.yml
  • Remove SonarCloud job
  • v4.2.4rc2
  • Add github token
  • Move coverall to other job
  • Add check files line
  • Modify coverall block
  • Add block npm script
  • v4.2.4
  • Add lang to HTML file
  • fixes html parse error
  • v4.2.6
  • Fix problem with defence-blog.com
  • Add rule to parse aljazeera.com articles
  • Fix linting issue
  • v4.2.7 - Bump version
  • Add test script for custom user agent
  • Improve rule based extracting tool
  • v4.2.8 - Update dependencies
  • v4.2.8 - Add rule for people.com
  • Replace html-minifier with html-minifier-terser
  • v4.2.8
  • Add node 16.x to test
  • Support memory.ai/timely-blog articles
  • v4.2.9
  • Replace lazy pictures with original ones
  • v4.2.10
  • Update test script
  • Change coding style to standardjs
  • Add standardjs badge
  • v4.3.0
  • Fix coverall github action version
  • Fix SonarCloud security warning
  • Update test script
  • v4.2.12
  • v4.2.12
  • v4.2.13
  • v4.3.0
  • Change default user-agent string to Firefox
  • Update test script
  • Remove test with node < 14
  • v5.0.0rc1
  • v5.0.0rc2
  • v5.0.0rc3
  • v5.0.0rc4
  • Update README
  • Update README
  • Update test lines indentation
  • v5.0.0rc5
  • Update README.md
  • v5.0.0rc6
  • Fix coverage percent
  • v5.0.0
  • v5.0.1
  • Update README.md
  • v6.0.0rc1
  • Switch to es6 module format
  • v5.0.2
  • Add missing items for 5.0
  • v5.0.3
  • Add missing items for 5.0
  • fix: typo
  • feat: npx for standard & cross-env for windows
  • fix(build): mark *.node and ./xhr-sync-worker.js as external... use fs for file operate add type for config
  • chore: update dependencies
  • chore(build): minify but generate sourcemap
  • chore(build): add canvas to external
  • v6.0.0rc1
  • chore: update dependencies
  • feat(type): clarify types & fix typo
  • feat(type): add description of extract
  • feat: replace jsdom with linkedom... which is "lighter"
  • feat(type): clarify types & fix typo
  • feat: replace jsdom with linkedom... which is "lighter"
  • feat(build): add @types/sanitize-html
  • chore: update bellajs
  • v5.0.4
  • v6.0.0rc1
  • v6.0.0rc1
  • chore: add eval:cjs task & remove unused external
  • fix: use github for string-comparison
  • v6.0.0rc2
  • feat: cheerio to linkedom
  • feat: update jest config for esm
  • feat: switch to html-crush for better compatibility
  • v6.0.0rc3
  • chore: isValidUrl called in parseFromHtml.js
  • fix: lack of QueryRule.transform
  • feat: rules ascending
  • chore: refract best url logical
  • chore: remove unused comma
  • fix: arg name for document
  • feat: add browser bundle
  • fix: use Array.from for HTMLCollection
  • feat: absolutify image src
  • fix: Readability with right base url
  • fix: typo
  • fix: use base element for baseURI
  • fix: linkedom difference with browser... add relative image test
  • chore: lint
  • feat: update for "Document.baseURI always undefined" (WebReflection/linkedom#117)
  • feat: update string-comparison for "ES6 Import doesn't work for me" (extractus/article-extractor#206, comment)
  • feat: update string-comparison import
  • chore: remove logging
  • v6.0.0rc4
  • feat(ci): add action for publish to npm... when release
  • v6.0.0rc5
  • feat(ci): add package lock file for npm ci
  • feat(ci): add npmignore for npm publish
  • chore(build): revert push dist
  • feat: use URLPattern
  • chore: add TODO for "Break the support of exports field" (makotoshimazu/jest-module-field-resolver#2)
  • v6.0.0rc6
  • v6.0.0rc7
  • feat: use Readability to fetch title Readability will fallback to title tag when can't find
  • v6.0.0rc8
  • feat: schema datepublished ("Add schema.org datePublished metadata", extractus/article-extractor#237)
  • fix(doc): transform use linkedom now
  • Update src/utils/extractMetaData.js
  • fix: allow unwanted without selector
  • Update src/utils/extractWithSelector.js
  • Update src/utils/extractWithSelector.js
  • Update src/utils/extractWithSelector.js
  • feat: support for URLPattern(string) constructor
  • feat: support set of rules
  • chore: lint
  • feat: types of rules getter and setter
  • v6.0.0rc9
  • Remove CI npm publish script
  • Readd dist folder
  • fix unwanted without selector
  • export parseHTML linkedom
  • removed stripUnwantedTags try-catch block
  • Update src/utils/stripUnwantedTags.js
  • removed parseHTML reference
  • v6.0.0rc10
  • feat: mutable html crush options (extractus/article-extractor#253)
  • chore: update linkedom (extractus/article-extractor#254)
  • chore: ignore .DS_Store config files created by MacOs
  • chore: add prettier config to enforce code style
  • chore: implement tests for new methods
  • chore: create new methods split from standardizeArticle
  • chore: add new methods to parseFromHtml and delete unused util
  • chore: update dist with latest build
  • chore: update param documentation
  • chore: update dependencies (extractus/article-extractor#257)
  • v6.0.0
  • v6.0.1
  • fix: can't fetch html from document on browser
  • v6.0.2
  • v6.0.2
  • chore: update urlpattern-polyfill
  • v6.0.3
  • v6.0.3 - Rebuild
  • v6.0.4
  • v6.0.4
  • v6.0.4
  • v6.0.4
  • Update README
  • Update README
  • v6.0.5
  • v6.0.5
  • v6.0.6
  • v7.0.0rc1
  • Update README.md
  • v7.0.0rc2
  • v7.0.0rc3
  • v7.0.0rc3
  • Change method to deal with source and description
  • v7.0.0
  • v7.0.1
  • Update README
  • v7.0.2
  • v7.0.2
  • v7.0.3
  • v7.1.0 - To work with bun and deno
  • Update types definition
  • v7.1.1
  • v7.1.1
  • v7.2.0rc1
  • Update README refer links
  • v7.2.0rc2 - Rebuild
  • Update README
  • v7.2.0rc3
  • v7.2.0rc4
  • v7.2.0rc5
  • Add examples with node, deno, bun, tsnode
  • Remove bun.lockb
  • Rebuild
  • v7.2.0
  • Update examples
  • v7.2.1-rc1
  • v7.2.1
  • v7.2.2-rc1
  • Update dependencies
  • Update README
  • v7.2.2-rc2
  • v7.2.2
  • v7.2.3
  • Update README
  • Add option to keep/remove line breaks
  • v7.2.4
  • v7.2.5
  • Update README
  • Add more specs for meta data extraction
  • Add security policy
  • Add ci test with node 19.x
  • Update security policy.
  • Update security contact
  • Add contributing guide
  • Update README
  • Update SECURITY.md
  • v7.2.6 - Migrate to extractus org
  • Update README
  • Update coveralls github action
  • v7.2.7
  • Update CI settings
  • Update CI config
  • Fix CI settings
  • Update CI settings
  • Update README
  • Add image to docs
  • Update README
  • v7.2.8
  • Update README
  • v7.2.9
  • v7.2.10
  • Add null to response types
  • v7.2.11
  • v7.2.12
  • Update ci config
  • v7.2.13rc1
  • v7.2.13
  • Rebuild v7.2.13
  • v7.2.14
  • Change string array to dictionary
  • v7.2.15
  • v7.2.15
  • v7.2.16
  • Add favicon to meta data
  • v7.2.17
  • v7.2.17
  • v7.2.17
  • v7.2.17
  • v7.2.18
  • v7.3.0
  • Update README
  • v8.0.0 - Bump version
  • Update README
  • Update README
  • v8.0.1
  • Update dependencies
  • Use childNodes instead of children
  • Update README
  • Fix ParserOptions typing
  • v8.0.3
  • Stop ci test with node < 16 because EOL
  • Feat: extract pagetype from og:type or ld+json
  • v8.0.8
  • Update examples
  • v8.0.5
  • Fix CI issue with coverall
  • Fix CI issue
  • Fix CI problem
  • Change ci event
  • Update CI event
  • Fix CI problem
  • Fix CI issue
  • Fix CI coverall
  • v8.0.6
  • v8.0.7
  • v8.0.8
  • Add node 22 to ci
  • Update examples & test with puppeteer
  • v8.0.9
  • v8.0.10
  • chore: Improvements in handling LD+JSON data
  • v8.0.11
  • Add test coverage
  • fix: Cannot read properties of undefined in ld+json
  • fix: more tests on ld+json
  • v8.0.12
  • Improvements to find dates
  • v8.0.13
  • v8.0.14
  • fix: adjustment of poorly formatted ldjson error
  • v8.0.15
  • v8.0.16
  • v8.0.17
  • Update eval script
  • v8.0.18
  • Update README
  • Update README
  • v8.0.19
  • Add test with node 24
  • v8.0.20 - Update dependencies
  • Remove examples
  • v8.0.20 - Update packages

daveschumaker and others added 30 commits April 1, 2022 13:51
- Change to ES6 Module format
- Change and update dependencies
  - Also update core logic

Related PRs: extractus#219, extractus#220, extractus#222, extractus#224, extractus#227, extractus#228, extractus#232, extractus#238, extractus#240, extractus#241, extractus#243, extractus#245
- Change code analysis to GitHub CodeQuality
- Update dependencies
fix: can't fetch html from document on browser
- Merge pr extractus#265 by @SettingDust (related issue extractus#264)
- Update dependencies
- Rebuild
…ern_polyfill

chore: update `urlpattern-polyfill`
- Merge pr extractus#269 by @SettingDust (issue extractus#266)
- Fix coding style
- Update more parser config
- Improve README & fix expired API key for example
- Improve README
- Add more test
- Improve README
- Improve README
- Fix link to default rules
andremacola and others added 28 commits October 18, 2024 05:25
- Fix inconsistent output (extractus#407)
- Modify some stuff at LdJson extraction (extractus#405)
  - Only use value from LdJson if missing from meta tags
  - Only accept string value from LdJson
  - Stop converting LdJson value to lowercase
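
The three LdJson rules above can be illustrated with a minimal sketch. This is not the library's actual code: `mergeLdJson`, the field names, and the sample values are all hypothetical, and the sketch assumes that "missed from meta tags" means an empty or absent value.

```javascript
// Hypothetical helper (not part of article-extractor's API): merge ld+json
// values into metadata that was already collected from <meta> tags.
const mergeLdJson = (meta, ldJson) => {
  const merged = { ...meta }
  for (const [key, value] of Object.entries(ldJson)) {
    // Only accept string values from ld+json; skip objects, arrays, numbers
    if (typeof value !== 'string') continue
    // Only use the ld+json value when the meta tags did not provide one
    if (merged[key]) continue
    // Keep the original casing: no toLowerCase() on the value
    merged[key] = value
  }
  return merged
}

// Sample data, invented for illustration
const meta = { title: 'From Meta Tags', image: '', author: '' }
const ldJson = {
  title: 'From LdJson',
  image: 'https://example.com/A.png',
  author: { name: 'X' }
}

const result = mergeLdJson(meta, ldJson)
console.log(result)
```

In this sketch the title from the meta tags is kept, the empty image is filled from ld+json with its casing preserved, and the non-string author value is ignored.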
fix: adjustment of poorly formatted ldjson error
- Fix issue extractus#412
- Update dependencies
- Update dependencies
- Update dependencies
- Update CI config
- Update README
- Fix image loss when ldjson overwrites meta data
- Update dependencies
- To stop dependencies outdated warning
@coturiv coturiv closed this Nov 26, 2025
@coturiv coturiv deleted the merge-extractus-main branch November 26, 2025 21:35

10 participants