Add defaults for parameters #1293

yangdanny97 · 2025-07-27T02:42:42Z

This PR uses a semi-automated, LLM-assisted method to add a few hundred parameter defaults to pandas-stubs, replacing the ... placeholders that were there previously.

There were 3 annotations that had to change because their type signatures are not compatible with the default.

The codemod that made these transformations is here: https://gist.github.com/yangdanny97/0d81e16f6374582413dd5bcf90d902fd

How it works:

1. Parse pandas-stubs and find the module, method/function, and parameter name for any parameters with a default of ...
2. Parse the docstring of the method/function in pandas, and if the parameter's docstring contains "default", extract the part after that keyword.
3. Look up the extracted snippet in a mapping from snippet to default value.
4. Use libCST to write that default into the parameter.

The main thing that needs careful review is the default_type_map, which maps each docstring snippet to the default value that replaces the ... placeholder. That part was generated with ChatGPT's help (I gave it every snippet I got after step 2), and I commented out the mappings that seemed wrong or unclear. So even though this was "LLM-assisted", the script is deterministic and gives the same output on every run; the LLM was only involved in a one-time manual step.
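As a rough illustration of the docstring-parsing and lookup steps described above, here is a minimal standard-library sketch. It is not the code from the linked gist: the names DEFAULT_MAP, extract_default_snippets, and lookup_defaults, the regular expression, and the sample docstring are all assumptions made for this example, and the final rewriting step with libCST is not shown.

```python
import re

# Hypothetical mapping from docstring snippet to the default written into the
# stub. Snippets that are missing here are simply skipped (left as "...").
DEFAULT_MAP = {
    "None": "None",
    "True": "True",
    "False": "False",
}

# Matches numpydoc parameter headers such as "axis : int, default 0".
_PARAM_RE = re.compile(r"^(\w+) : (.*)$")


def extract_default_snippets(docstring: str) -> dict:
    """Return {parameter name: text after the word 'default'}."""
    snippets = {}
    for line in docstring.splitlines():
        match = _PARAM_RE.match(line.strip())
        if match is None:
            continue  # not a parameter header line
        name, description = match.groups()
        _, found, after = description.partition("default")
        if found:
            snippets[name] = after.strip(" ,.")
    return snippets


def lookup_defaults(docstring: str) -> dict:
    """Keep only the parameters whose snippet has a reviewed mapping."""
    snippets = extract_default_snippets(docstring)
    return {
        name: DEFAULT_MAP[snippet]
        for name, snippet in snippets.items()
        if snippet in DEFAULT_MAP
    }


SAMPLE_DOC = """
Parameters
----------
periods : int, default 1
    Number of periods to shift.
axis : {0 or 'index', 1 or 'columns', None}, default None
    Shift direction.
fill_value : object, optional
    the default depends on the dtype of `self`.
"""

print(extract_default_snippets(SAMPLE_DOC))
print(lookup_defaults(SAMPLE_DOC))
```

Because the lookup is a plain dictionary, commenting out an entry (as was done for the unclear mappings) leaves the corresponding parameters untouched, and the result is the same on every run.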

yangdanny97 · 2025-07-27T02:47:19Z

pandas-stubs/core/indexes/timedeltas.pyi

-    start: TimedeltaConvertibleTypes = ...,
-    end: TimedeltaConvertibleTypes = ...,
-    periods: int | None = ...,
+    start: TimedeltaConvertibleTypes | None = None,


Docstring says the default is None so I had to change the annotation

https://github.com/pandas-dev/pandas/blob/e4a03b6e47a8ef9cd045902916289cbc976d3d33/pandas/core/indexes/timedeltas.py#L256-L259

Based on the docstring, this should probably be rewritten to have overloads:

Of the four parameters start, end, periods, and freq,
exactly three must be specified.
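A sketch of what such overloads could look like is below. It only encodes the docstring rule quoted above; range_spec is a hypothetical stand-in function, the parameters are made keyword-only to keep the example short, and str/int stand in for the real pandas-stubs types, so this is not the actual timedelta_range signature.

```python
from typing import overload


# One overload per combination in which exactly one parameter is omitted.
@overload
def range_spec(*, start: str, end: str, periods: int, freq: None = None) -> tuple: ...
@overload
def range_spec(*, start: str, end: str, periods: None = None, freq: str) -> tuple: ...
@overload
def range_spec(*, start: str, end: None = None, periods: int, freq: str) -> tuple: ...
@overload
def range_spec(*, start: None = None, end: str, periods: int, freq: str) -> tuple: ...


def range_spec(*, start=None, end=None, periods=None, freq=None):
    """Runtime counterpart: enforce the same rule the overloads describe."""
    given = [value for value in (start, end, periods, freq) if value is not None]
    if len(given) != 3:
        raise ValueError(
            "exactly three of start, end, periods, and freq must be specified"
        )
    return (start, end, periods, freq)


print(range_spec(start="1 day", end="3 days", periods=3))
```

With overloads like these, a type checker can reject a call that passes only two of the four arguments, which a single signature with four optional parameters cannot express.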

loicdiridollou · 2025-07-27T02:51:48Z

pandas-stubs/core/arrays/datetimelike.pyi

@@ -35,13 +35,22 @@ class TimelikeOps:
    def unit(self) -> TimeUnit: ...
    def as_unit(self, unit: TimeUnit) -> Self: ...
    def round(
-        self, freq, ambiguous: TimeAmbiguous = ..., nonexistent: TimeNonexistent = ...
+        self,


Any reason why your PR would change the formatting? I don't see a change in the argument.

I'm not sure what happened there. Either libCST changed the whitespace (which is weird, but it has happened before) or the pre-commit did.

As long as the pre-commit passes, that's not a big deal. I was trying to find a change and kept comparing the lines but could not find any!

loicdiridollou · 2025-07-27T02:53:02Z

Just for my own knowledge, what is adding the defaults instead of the ellipsis trying to achieve? Is it to help when looking at the docs in the IDE, for pyrefly, or something else?

yangdanny97 · 2025-07-27T02:58:29Z

pandas-stubs/core/frame.pyi

@@ -802,8 +808,8 @@ class DataFrame(NDFrame, OpsMixin, _GetItemHack):
        self,
        other: NDFrameT,
        join: AlignJoin = ...,
-        axis: Axis | None = ...,
-        level: Level | None = ...,
+        axis: Axis | None = None,


This annotation changed

So the kind of weird thing is, I can't actually find the docstring anywhere in the pandas repo. But at runtime pandas.core.frame.DataFrame.shift.__doc__ is:

Shift index by desired number of periods with an optional time `freq`.

When `freq` is not passed, shift the index without realigning the data. If `freq` is passed (in this case, the index must be date or datetime, or it will raise a `NotImplementedError`), the index will be increased using the periods and the `freq`. `freq` can be inferred when specified as "infer" as long as either freq or inferred_freq attribute is set in the index.

Parameters
----------
periods : int or Sequence
    Number of periods to shift. Can be positive or negative. If an iterable of ints, the data will be shifted once by each int. This is equivalent to shifting by one value at a time and concatenating all resulting frames. The resulting columns will have the shift suffixed to their column names. For multiple periods, axis must not be 1.
freq : DateOffset, tseries.offsets, timedelta, or str, optional
    Offset to use from the tseries module or time rule (e.g. 'EOM'). If `freq` is specified then the index values are shifted but the data is not realigned. That is, use `freq` if you would like to extend the index when shifting and preserve the original data. If `freq` is specified as "infer" then it will be inferred from the freq or inferred_freq attributes of the index. If neither of those attributes exist, a ValueError is thrown.
axis : {0 or 'index', 1 or 'columns', None}, default None
    Shift direction. For `Series` this parameter is unused and defaults to 0.
fill_value : object, optional
    The scalar value to use for newly introduced missing values. the default depends on the dtype of `self`. For numeric data, ``np.nan`` is used. For datetime, timedelta, or period data, etc. :attr:`NaT` is used. For extension dtypes, ``self.dtype.na_value`` is used.
suffix : str, optional
    If str and periods is an iterable, this is added after the column name and before the shift value for each shifted column name.

Returns
-------
DataFrame
    Copy of input object, shifted.

It is here https://github.com/pandas-dev/pandas/blob/e4a03b6e47a8ef9cd045902916289cbc976d3d33/pandas/core/generic.py#L10123

Usually, when a function is not in the file you would expect (for example, pd.Series.shift is not defined in series.py), it is defined in the class it inherits from (here NDFrame, which lives in generic.py).
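One way to check where an inherited method is actually defined is to walk the class's method resolution order. The sketch below uses toy NDFrame and Series classes as stand-ins so that it runs without pandas; defining_class is a hypothetical helper name, not a pandas API.

```python
class NDFrame:
    """Toy stand-in for the shared base class."""

    def shift(self, periods=1):
        """Shift index by desired number of periods."""
        return periods


class Series(NDFrame):
    """Toy stand-in for a subclass that inherits shift without redefining it."""

    def head(self):
        return "head"


def defining_class(cls, attr_name):
    """Return the first class in cls.__mro__ whose own namespace defines attr_name."""
    for base in cls.__mro__:
        if attr_name in vars(base):
            return base
    return None


print(defining_class(Series, "shift").__name__)
print(defining_class(Series, "head").__name__)
```

Once the defining class is known, the standard library's inspect.getsourcefile can be called on it to find the file that holds the docstring.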

yangdanny97 · 2025-07-27T02:59:36Z

pandas-stubs/core/arrays/datetimelike.pyi

    ): ...
    def floor(
-        self, freq, ambiguous: TimeAmbiguous = ..., nonexistent: TimeNonexistent = ...
+        self,


IDK why the pre-commit did these whitespace changes

yangdanny97 · 2025-07-27T03:00:18Z

pandas-stubs/core/series.pyi

@@ -1133,7 +1141,7 @@ class Series(IndexOpsMixin[S1], NDFrame):
        self,
        periods: int | Sequence[int] = ...,
        freq: DateOffset | timedelta | _str | None = ...,
-        axis: Axis = ...,
+        axis: Axis | None = None,


this annotation changed

yangdanny97 · 2025-07-27T03:04:59Z

Just for my own knowledge, what is adding the defaults instead of the ellipsis trying to achieve? Is it to help when looking at the docs in the IDE, for pyrefly, or something else?

It's to help with looking at docs/signature in the IDE. This work is unrelated to Pyrefly.

Here's a thread on typeshed w/ some discussion on the pros and cons of defaults: python/typeshed#8988

The consensus seems to be that if the default is simple (like None, or some literal) then it's fine, but if it's complex we keep the ... placeholder. Mypy's stub generator has been updated to emit stubs containing simple defaults.
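To illustrate the difference a simple default makes to a displayed signature, here is a small runtime analogy. Stub files are not executed and IDEs read them directly, so this only approximates what a tooltip would show; shift_before and shift_after are hypothetical functions written for this example.

```python
import inspect


# Signature as it looks when every default is the ellipsis placeholder.
def shift_before(periods: int = ..., axis=...): ...


# Same signature with the simple defaults spelled out.
def shift_after(periods: int = 1, axis=None): ...


sig_before = str(inspect.signature(shift_before))
sig_after = str(inspect.signature(shift_after))

print(sig_before)  # every default renders as Ellipsis
print(sig_after)   # the real defaults are visible
```

In the first signature the reader cannot tell what happens when an argument is omitted, while the second answers that question without opening the documentation.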

The second reason I made the PR is just to test out this way of doing LLM-assisted codemods. I have another PR in the works that uses the same approach to add a few hundred types for un-annotated parameters and returns, but that one requires more manual filtering to make sure the generated types are good.

yangdanny97 added 3 commits July 26, 2025 22:22

add defaults for parameters

047e5d2

add more defaults

2e371b6

fix type errors

9617ee6


yangdanny97 marked this pull request as ready for review July 27, 2025 03:05

add more defaults

e66384f

yangdanny97 marked this pull request as draft July 27, 2025 18:21

fix deprecated function

b544b4e

yangdanny97 marked this pull request as ready for review July 27, 2025 19:51
