parquet-hadoop tests fail if working behind a web proxy #3482

@alexeyroytman

Describe the enhancement requested

My working environment is behind a web proxy. The Maven build fails for me on the parquet-hadoop tests that download files from https://github.com, which is not directly accessible from my network.

The failing tests are:

[ERROR]   TestInterOpReadByteStreamSplit.testReadAllSupportedTypes » SocketTimeout Connect timed out
[ERROR]   TestInterOpReadByteStreamSplit.testReadFloats » SocketTimeout Connect timed out
[ERROR]   TestInterOpReadFloat16.testInterOpReadFloat16NonZerosAndNansParquetFiles » SocketTimeout Connect timed out
[ERROR]   TestInterOpReadFloat16.testInterOpReadFloat16ZerosAndNansParquetFiles » SocketTimeout Connect timed out
[ERROR]   TestInteropBloomFilter.testReadDataIndexBloomParquetFiles » SocketTimeout Connect timed out
[ERROR]   TestInteropBloomFilter.testReadDataIndexBloomWithLengthParquetFiles » SocketTimeout Connect timed out
[ERROR]   TestInteropReadLz4RawCodec.testInteropReadLz4RawLargerParquetFiles » SocketTimeout Connect timed out
[ERROR]   TestInteropReadLz4RawCodec.testInteropReadLz4RawSimpleParquetFiles » SocketTimeout Connect timed out

All of them fail in the same way:

org.apache.parquet.hadoop.TestInterOpReadByteStreamSplit.testReadAllSupportedTypes -- Time elapsed: 10.24 s <<< ERROR!
java.net.SocketTimeoutException: Connect timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
        at java.base/java.net.Socket.connect(Socket.java:633)
        at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
        at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
        at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
        at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)

The main reason appears to be that the OkHttpClient used by these tests is not configured to use any proxy, and in this flow it does not pick up proxy settings from system properties (e.g. https.proxyHost) or environment variables (e.g. https_proxy).
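
One possible direction, as a minimal sketch rather than a proposed patch: resolve a java.net.Proxy from the https.proxyHost / https.proxyPort system properties, fall back to the https_proxy environment variable, and hand the result to the HTTP client. The class name ProxyResolver and its methods are hypothetical and do not exist in parquet-java; the sketch uses only the JDK and assumes the environment variable has the form http://host:port.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of parquet-java.
class ProxyResolver {

    // Resolves a proxy from the given properties and environment.
    // Returns Proxy.NO_PROXY when nothing is configured.
    static Proxy resolve(Properties props, Map<String, String> env) {
        // 1. Standard JDK networking properties take precedence.
        String host = props.getProperty("https.proxyHost");
        if (host != null && !host.isEmpty()) {
            // 443 is the documented JDK default for https.proxyPort.
            int port = Integer.parseInt(props.getProperty("https.proxyPort", "443"));
            return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
        }

        // 2. Fall back to the conventional environment variable (either case).
        String url = env.get("https_proxy");
        if (url == null || url.isEmpty()) {
            url = env.get("HTTPS_PROXY");
        }
        if (url != null && !url.isEmpty()) {
            // Accept "host:port" as well as "http://host:port".
            URI uri = URI.create(url.contains("://") ? url : "http://" + url);
            if (uri.getHost() != null) {
                // Assumption: default to port 80 when the variable omits the port.
                int port = uri.getPort() != -1 ? uri.getPort() : 80;
                return new Proxy(Proxy.Type.HTTP,
                        InetSocketAddress.createUnresolved(uri.getHost(), port));
            }
        }
        return Proxy.NO_PROXY;
    }

    // Convenience overload reading the real JVM properties and process environment.
    static Proxy resolve() {
        return resolve(System.getProperties(), System.getenv());
    }

    public static void main(String[] args) {
        System.out.println("Resolved proxy: " + resolve());
    }
}
```

The resolved value could then be passed to the client, for example via `new OkHttpClient.Builder().proxy(ProxyResolver.resolve())`. Whether the tests build their client in a place where this fits is an assumption. Note also that system properties set on the Maven command line may not reach forked test JVMs unless they are forwarded explicitly.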

Component(s)

Build
