Commit 3cae38b

HyukjinKwon authored and zhengruifeng committed
[SPARK-43612][PYTHON][CONNECT][FOLLOW-UP] Copy dependent data files to data directory
### What changes were proposed in this pull request?

This PR proposes to move several data files used for PySpark artifact tests from `connector/connect/common/src/test/resources/artifact-tests`, added in apache#40368, to the `data` directory.

### Why are the changes needed?

PySpark tests should not depend on Spark's test package build. This PR decouples them.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

CI in this PR should verify it.

Closes apache#41510 from HyukjinKwon/SPARK-43612-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
1 parent f1cca85 commit 3cae38b

6 files changed

+19 -3 lines changed

data/artifact-tests/crc/README.md

+5
@@ -0,0 +1,5 @@
+The CRCs for a specific file are stored in a text file with the same name (excluding the original extension).
+
+The CRCs are calculated for data chunks of `32768 bytes` (individual CRCs) and are newline delimited.
+
+The CRCs were calculated using https://simplycalc.com/crc32-file.php
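
The README's description of the CRC files can be sketched in Python as follows. This is a minimal illustration, not the tool used in the commit: it assumes the CRCs are standard CRC-32 values as computed by `zlib.crc32` and written as unsigned decimal integers, and the helper names `chunk_crcs` and `format_crc_file` are hypothetical.

```python
import zlib
from typing import Iterator

# Chunk size stated in the README (32768 bytes).
CHUNK_SIZE = 32768


def chunk_crcs(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[int]:
    """Yield the CRC-32 of each consecutive chunk of `data`.

    Assumption: the CRC variant matches zlib.crc32 (unsigned 32-bit).
    """
    for start in range(0, len(data), chunk_size):
        yield zlib.crc32(data[start:start + chunk_size])


def format_crc_file(data: bytes) -> str:
    """Render the per-chunk CRCs as newline-delimited decimal text."""
    return "\n".join(str(crc) for crc in chunk_crcs(data))
```

Under these assumptions, calling `format_crc_file` on the bytes of a jar would produce one decimal value per 32768-byte chunk, with the final chunk allowed to be shorter.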
+12
@@ -0,0 +1,12 @@
+902183889
+2415704507
+1084811487
+1951510
+1158852476
+2003120166
+3026803842
+3850244775
+3409267044
+652109216
+104029242
+3019434266

data/artifact-tests/crc/smallJar.txt

+1
@@ -0,0 +1 @@
+1631702900

data/artifact-tests/junitLargeJar.jar

376 KB
Binary file not shown.

data/artifact-tests/smallJar.jar

787 Bytes
Binary file not shown.
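
As a rough sanity check, the number of CRC entries for a file should equal the number of 32768-byte chunks it spans. The sketch below uses the sizes shown on this page; the 376 KB figure is rounded, so the exact byte count used here is an assumption, and `expected_crc_lines` is a hypothetical helper.

```python
CHUNK_SIZE = 32768  # chunk size stated in the CRC README


def expected_crc_lines(file_size_bytes: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks (and therefore CRC lines) for a file of this size."""
    # Ceiling division: a trailing partial chunk still gets its own CRC.
    return -(-file_size_bytes // chunk_size)


# Assumption: 376 KB is taken as 376 * 1024 bytes (the page shows a rounded size).
print(expected_crc_lines(376 * 1024))  # 12
print(expected_crc_lines(787))         # 1
```

These counts agree with the 12-entry CRC listing and the single entry in `smallJar.txt` shown in this commit.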

python/pyspark/sql/tests/connect/client/test_artifact.py

+1-3
@@ -33,9 +33,7 @@ class ArtifactTests(ReusedConnectTestCase):
     def setUpClass(cls):
         super(ArtifactTests, cls).setUpClass()
         cls.artifact_manager: ArtifactManager = cls.spark._client._artifact_manager
-        cls.base_resource_dir = os.path.join(
-            SPARK_HOME, "connector", "connect", "common", "src", "test", "resources"
-        )
+        cls.base_resource_dir = os.path.join(SPARK_HOME, "data")
         cls.artifact_file_path = os.path.join(
             cls.base_resource_dir,
             "artifact-tests",
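
To illustrate the effect of this change, the sketch below resolves a test jar and its CRC file under the new `data` directory. It is a standalone approximation, not the test suite's code: the helper `artifact_test_paths` is hypothetical, and the CRC file location is inferred from the file layout and README in this commit.

```python
import os
from typing import Tuple


def artifact_test_paths(spark_home: str, jar_name: str) -> Tuple[str, str]:
    """Return (jar_path, crc_path) for a test artifact under SPARK_HOME/data."""
    base_resource_dir = os.path.join(spark_home, "data")
    artifact_dir = os.path.join(base_resource_dir, "artifact-tests")
    jar_path = os.path.join(artifact_dir, jar_name)
    # Per the README, the CRC file shares the jar's name without its extension.
    crc_name = os.path.splitext(jar_name)[0] + ".txt"
    crc_path = os.path.join(artifact_dir, "crc", crc_name)
    return jar_path, crc_path
```

For example, `artifact_test_paths(spark_home, "smallJar.jar")` would point at `data/artifact-tests/smallJar.jar` and `data/artifact-tests/crc/smallJar.txt` relative to `spark_home`, with no dependency on the `connector/connect/common` test resources.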
