Open
Description
From the CSV source part of the project README, I got the impression that type conversion for loaded CSV data is performed according to the specified schema.
But if I define a custom schema for a CsvSource with columns of types other than String (Int, for example), the values in those columns are still returned as String.
Is this intended behaviour, a bug, or has it just not been implemented yet?
Runnable example:
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets

import io.eels.component.csv.CsvSource
import io.eels.schema._

object CsvSourceTypeConversionTest extends App {

  val exampleCsvString =
    """A,B,C,D
      |1,2.2,3,foo
      |4,5.5,6,bar
      |""".stripMargin

  val stream = new ByteArrayInputStream(exampleCsvString.getBytes(StandardCharsets.UTF_8))

  val schema = new StructType(Vector(
    Field("A", IntType.Signed),
    Field("B", DoubleType),
    Field("C", IntType.Signed),
    Field("D", StringType)
  ))

  val ds = new CsvSource(stream _, Some(schema)).toDataStream()

  val firstRow = ds.iterator.toIterable.head
  val firstRowA = firstRow.get("A")

  println(firstRowA) // prints 1 as expected
  println(firstRowA.getClass.getTypeName) // prints java.lang.String
  assert(firstRowA == 1) // this assertion will fail because firstRowA is not an Int
}
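
For reference, the behaviour I expected can be sketched by hand. This is only a minimal sketch and does not use eel's API: `ColType`, `IntCol`, `DoubleCol`, `StringCol` and `ManualConversion` are hypothetical names made up for illustration, and I am assuming each raw cell arrives as a String that parses cleanly for its declared column type.

```scala
// Hypothetical stand-ins for column types; these are NOT eel's schema classes.
sealed trait ColType
case object IntCol extends ColType
case object DoubleCol extends ColType
case object StringCol extends ColType

object ManualConversion {

  // Convert one raw CSV cell to the type its column declares.
  // Throws NumberFormatException if the cell does not parse.
  def convert(raw: String, colType: ColType): Any = colType match {
    case IntCol    => raw.trim.toInt
    case DoubleCol => raw.trim.toDouble
    case StringCol => raw
  }

  // Convert a whole row, pairing each cell with its column type by position.
  def convertRow(raw: Seq[String], types: Seq[ColType]): Seq[Any] = {
    require(raw.length == types.length, "row width must match schema width")
    raw.zip(types).map { case (cell, t) => convert(cell, t) }
  }
}
```

With something like this applied to the first data row, `ManualConversion.convertRow(Seq("1", "2.2", "3", "foo"), Seq(IntCol, DoubleCol, IntCol, StringCol)).head == 1` holds, which is the result I expected from the schema-aware CsvSource.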