docs: fit transform addition #16

Closed
davidbp wants to merge 1 commit into JuliaAI:dev from davidbp:docs-fit-transform

davidbp commented Jan 31, 2023

I am missing a method that fits a transformer and returns the transformed result in a single call.

Here is an example from sklearn that shows the use of fit_transform:

>>> from sklearn.feature_extraction.text import CountVectorizer
>>> corpus = [
...     'This is the first document.',
...     'This document is the second document.',
...     'And this is the third one.',
...     'Is this the first document?',
... ]
>>> vectorizer = CountVectorizer()
>>> X = vectorizer.fit_transform(corpus)
>>> vectorizer.get_feature_names_out()
array(['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third',
       'this'], ...)
>>> print(X.toarray())
[[0 1 1 1 0 0 1 0 1]
 [0 2 0 1 0 1 1 0 1]
 [1 0 0 1 1 0 1 1 1]
 [0 1 1 1 0 0 1 0 1]]

Note that you could think of fit_transform as simply calling fit and then transform, but that would require iterating over the data twice (once per function call).

One benefit of a dedicated fit_transform is that it can iterate over the data only once and generate the transformed data during that same pass.
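To make the single-pass point concrete, here is a rough sketch in plain Python. `TinyCountVectorizer` and its `tokenize_calls` counter are hypothetical names invented for this illustration; this is not how sklearn implements `CountVectorizer`, only a toy that produces the same counts for the corpus above. The separate `fit` and `transform` each tokenize every document, while `fit_transform` tokenizes each document once and reuses the per-document counts.

```python
import re
from collections import Counter

# Tokens of two or more word characters, lowercased (an assumption chosen
# to mimic the default behaviour shown in the sklearn example).
TOKEN = re.compile(r"\b\w\w+\b")


class TinyCountVectorizer:
    """Hypothetical toy vectorizer, for illustration only."""

    def __init__(self):
        self.vocabulary_ = {}
        self.tokenize_calls = 0  # counts how often a document is tokenized

    def _tokenize(self, doc):
        self.tokenize_calls += 1
        return TOKEN.findall(doc.lower())

    def fit(self, docs):
        # First pass over the data: collect the vocabulary.
        words = {w for doc in docs for w in self._tokenize(doc)}
        self.vocabulary_ = {w: i for i, w in enumerate(sorted(words))}
        return self

    def transform(self, docs):
        # Second pass over the data: count tokens against the vocabulary.
        rows = []
        for doc in docs:
            counts = Counter(self._tokenize(doc))
            rows.append([counts.get(w, 0) for w in self.vocabulary_])
        return rows

    def fit_transform(self, docs):
        # Single pass: tokenize each document once and keep its counts,
        # then build the vocabulary and the rows from those counts.
        per_doc = [Counter(self._tokenize(doc)) for doc in docs]
        words = set().union(*per_doc)
        self.vocabulary_ = {w: i for i, w in enumerate(sorted(words))}
        return [[c.get(w, 0) for w in self.vocabulary_] for c in per_doc]


corpus = [
    "This is the first document.",
    "This document is the second document.",
    "And this is the third one.",
    "Is this the first document?",
]

two_step = TinyCountVectorizer()
X_two_step = two_step.fit(corpus).transform(corpus)

one_step = TinyCountVectorizer()
X_one_step = one_step.fit_transform(corpus)
```

Both routes give the same matrix, but in this toy the two-step route tokenizes each document twice and the one-step route tokenizes it once.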

If no specific efficient fit_transform is implemented, it could just be syntactic sugar for calling fit!(transformer, X) and then transform(transformer, X).
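The fallback idea might look roughly like this. The sketch is in Python to stay consistent with the sklearn example rather than in Julia, and the `Transformer` base class and `MeanCenterer` are hypothetical names made up for the illustration, not part of any existing package. The base class supplies a default `fit_transform` that simply chains `fit` and `transform`; a concrete transformer can override it when a more efficient single-pass version exists.

```python
class Transformer:
    """Hypothetical base class, for illustration only."""

    def fit(self, X):
        raise NotImplementedError

    def transform(self, X):
        raise NotImplementedError

    def fit_transform(self, X):
        # Default fallback: plain sugar for fit followed by transform.
        # Subclasses may override this with a single-pass implementation.
        return self.fit(X).transform(X)


class MeanCenterer(Transformer):
    """Toy transformer that subtracts the mean learned during fit."""

    def fit(self, X):
        self.mean_ = sum(X) / len(X)
        return self  # returning self allows chaining fit(...).transform(...)

    def transform(self, X):
        return [x - self.mean_ for x in X]


centered = MeanCenterer().fit_transform([1.0, 2.0, 3.0])
```

With this arrangement every transformer gets `fit_transform` for free, and only those that benefit from a fused pass need to write their own.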

ablaom (Member) commented Mar 2, 2023

Sorry for my late response, and thanks for bringing this up.

Yes, I understand that there is a use-case for fitting and transforming in one go; as you say one can avoid extra computation/allocation.

Currently I am focused on reducing the number of methods as much as possible, so I will come back to this later, but I will keep it in mind.

ablaom mentioned this pull request Mar 2, 2023

ablaom (Member) commented Mar 2, 2023

Closing in favour of #18.

ablaom closed this Mar 2, 2023