Latest changes in Spark 2.0.1
Sub-task
- [SPARK-15232] - Add subquery SQL building tests to LogicalPlanToSQLSuite
- [SPARK-15698] - Ability to remove old metadata for structured streaming MetadataLog
- [SPARK-15814] - Aggregator can return null result
- [SPARK-16287] - Implement str_to_map SQL function
- [SPARK-16312] - Docs for Kafka 0.10 consumer integration
- [SPARK-16380] - Update SQL examples and programming guide for Python language binding
- [SPARK-16391] - KeyValueGroupedDataset.reduceGroups should support partial aggregation
- [SPARK-16508] - Fix documentation warnings found by R CMD check
- [SPARK-16510] - Move SparkR test JAR into Spark, include its source code
- [SPARK-16519] - Handle SparkR RDD generics that create warnings in R CMD check
- [SPARK-16577] - Add check-cran script to Jenkins
- [SPARK-16579] - Add a spark install function
- [SPARK-16581] - Making JVM backend calling functions public
- [SPARK-16621] - Generate stable SQLs in SQLBuilder
- [SPARK-16734] - Make sure examples in all language bindings are consistent
- [SPARK-16735] - Fail to create a map containing decimal type with literals having different inferred precisions and scales
- [SPARK-16774] - Fix use of deprecated TimeStamp constructor (also providing incorrect results)
- [SPARK-16776] - Fix Kafka deprecation warnings
- [SPARK-16778] - Fix use of deprecated SQLContext constructor
- [SPARK-16800] - Fix Java Examples that throw exception
- [SPARK-16866] - Basic infrastructure for file-based SQL end-to-end tests
- [SPARK-17007] - Move test data files into a test-data folder
- [SPARK-17008] - Normalize query results using sorting
- [SPARK-17009] - Use a new SparkSession for each test case
- [SPARK-17011] - Support testing exceptions in queries
- [SPARK-17015] - group-by-ordinal and order-by-ordinal test cases
- [SPARK-17018] - literals.sql for testing literal parsing
- [SPARK-17042] - Repl-defined classes cannot be replicated
- [SPARK-17096] - Fix StreamingQueryListener to return message and stacktrace of actual exception
- [SPARK-17149] - array.sql for testing array related functions
- [SPARK-17165] - FileStreamSource should not track the list of seen files indefinitely
- [SPARK-17235] - MetadataLog should support purging old logs
- [SPARK-17269] - Move finish analysis stage into its own file
- [SPARK-17270] - Move object optimization rules into its own file
- [SPARK-17274] - Move join optimizer rules into a separate file
- [SPARK-17372] - Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError
- [SPARK-17513] - StreamExecution should discard unneeded metadata
- [SPARK-17586] - Use Static member not via instance reference
- [SPARK-18151] - CLONE - MetadataLog should support purging old logs
- [SPARK-18152] - CLONE - FileStreamSource should not track the list of seen files indefinitely
- [SPARK-18153] - CLONE - Ability to remove old metadata for structured streaming MetadataLog
- [SPARK-18156] - CLONE - StreamExecution should discard unneeded metadata
Bug
- [SPARK-10683] - Source code missing for SparkR test JAR
- [SPARK-11227] - Spark1.5+ HDFS HA mode throw java.net.UnknownHostException: nameservice1
- [SPARK-12666] - spark-shell --packages cannot load artifacts which are publishLocal'd by SBT
- [SPARK-14204] - [SQL] Failure to register URL-derived JDBC driver on executors in cluster mode
- [SPARK-14209] - Application failure during preemption.
- [SPARK-14818] - Move sketch and mllibLocal out from mima exclusion
- [SPARK-15083] - History Server would OOM due to unlimited TaskUIData in some stages
- [SPARK-15285] - Generated SpecificSafeProjection.apply method grows beyond 64 KB
- [SPARK-15382] - monotonicallyIncreasingId doesn't work when data is upsampled
- [SPARK-15390] - Memory management issue in complex DataFrame join and filter
- [SPARK-15541] - SparkContext.stop throws error
- [SPARK-15869] - HTTP 500 and NPE on streaming batch details page
- [SPARK-15899] - file scheme should be used correctly
- [SPARK-15989] - PySpark SQL python-only UDTs don't support nested types
- [SPARK-16062] - PySpark SQL python-only UDTs don't work well
- [SPARK-16321] - [Spark 2.0] Performance regression when reading parquet and using PPD and non-vectorized reader
- [SPARK-16334] - SQL query on parquet table java.lang.ArrayIndexOutOfBoundsException
- [SPARK-16409] - regexp_extract with optional groups causes NPE
- [SPARK-16439] - Incorrect information in SQL Query details
- [SPARK-16440] - Undeleted broadcast variables in Word2Vec causing OoM for long runs
- [SPARK-16457] - Wrong messages when CTAS with a Partition By clause
- [SPARK-16460] - Spark 2.0 CSV ignores NULL value in Date format
- [SPARK-16462] - Spark 2.0 CSV does not cast null values to certain data types properly
- [SPARK-16522] - [MESOS] Spark application throws exception on exit
- [SPARK-16533] - Spark application not handling preemption messages
- [SPARK-16550] - Caching data with replication doesn't replicate data
- [SPARK-16558] - examples/mllib/LDAExample should use MLVector instead of MLlib Vector
- [SPARK-16563] - Repeat calling Spark SQL thrift server fetchResults return empty for ExecuteStatement operation
- [SPARK-16586] - spark-class crash with "[: too many arguments" instead of displaying the correct error message
- [SPARK-16597] - DataFrame DateType is written as an int(Days since epoch) by csv writer
- [SPARK-16610] - When writing ORC files, orc.compress should not be overridden if users do not set "compression" in the options
- [SPARK-16613] - RDD.pipe returns values for empty partitions
- [SPARK-16632] - Vectorized parquet reader fails to read certain fields from Hive tables
- [SPARK-16633] - lag/lead using constant input values does not return the default value when the offset row does not exist
- [SPARK-16634] - GenericArrayData can't be loaded in certain JVMs
- [SPARK-16639] - query fails if having condition contains grouping column
- [SPARK-16642] - ResolveWindowFrame should not be triggered on UnresolvedFunctions.
- [SPARK-16644] - constraints propagation may fail the query
- [SPARK-16646] - LEAST doesn't accept numeric arguments with different data types
- [SPARK-16648] - LAST_VALUE(FALSE) OVER () throws IndexOutOfBoundsException
- [SPARK-16656] - CreateTableAsSelectSuite is flaky
- [SPARK-16664] - Spark 1.6.2 - Persist call on Data frames with more than 200 columns is wiping out the data.
- [SPARK-16672] - SQLBuilder should not raise exceptions on EXISTS queries
- [SPARK-16686] - Dataset.sample with seed: result seems to depend on downstream usage
- [SPARK-16698] - json parsing regression - "." in keys
- [SPARK-16699] - Fix performance bug in hash aggregate on long string keys
- [SPARK-16700] - StructType doesn't accept Python dicts anymore
- [SPARK-16703] - Extra space in WindowSpecDefinition SQL representation
- [SPARK-16711] - YarnShuffleService doesn't re-init properly on YARN rolling upgrade
- [SPARK-16714] - Fail to create decimal arrays with literals having different inferred precisions and scales
- [SPARK-16715] - Fix a potential ExprId conflict for SubexpressionEliminationSuite."Semantic equals and hash"
- [SPARK-16721] - Lead/lag needs to respect nulls
- [SPARK-16724] - Expose DefinedByConstructorParams
- [SPARK-16729] - Spark should throw analysis exception for invalid casts to date type
- [SPARK-16730] - Spark 2.0 breaks various Hive cast functions
- [SPARK-16740] - joins.LongToUnsafeRowMap crashes with NegativeArraySizeException
- [SPARK-16748] - Errors thrown by UDFs cause TreeNodeException when the query has an ORDER BY clause
- [SPARK-16750] - ML GaussianMixture training failed due to feature column type mistake
- [SPARK-16751] - Upgrade derby to 10.12.1.1 from 10.11.1.1
- [SPARK-16770] - Spark shell not usable with german keyboard due to JLine version
- [SPARK-16781] - java launched by PySpark as gateway may not be the same java used in the spark environment
- [SPARK-16785] - dapply doesn't return array or raw columns
- [SPARK-16787] - SparkContext.addFile() should not fail if called twice with the same file
- [SPARK-16791] - casting structs fails on Timestamp fields (interpreted mode only)
- [SPARK-16802] - joins.LongToUnsafeRowMap crashes with ArrayIndexOutOfBoundsException
- [SPARK-16818] - Exchange reuse incorrectly reuses scans over different sets of partitions
- [SPARK-16831] - CrossValidator reports incorrect avgMetrics
- [SPARK-16836] - Hive date/time function error
- [SPARK-16837] - TimeWindow incorrectly drops slideDuration in constructors
- [SPARK-16850] - Improve error message for greatest/least
- [SPARK-16873] - force spill NPE
- [SPARK-16880] - Improve ANN training, add training data persist if needed
- [SPARK-16883] - SQL decimal type is not properly cast to number when collecting SparkDataFrame
- [SPARK-16901] - Hive settings in hive-site.xml may be overridden by Hive's default values
- [SPARK-16905] - Support SQL DDL: MSCK REPAIR TABLE
- [SPARK-16907] - Parquet table reading performance regression when vectorized record reader is not used
- [SPARK-16922] - Query with Broadcast Hash join fails due to executor OOM in Spark 2.0
- [SPARK-16925] - Spark tasks which cause JVM to exit with a zero exit code may cause app to hang in Standalone mode
- [SPARK-16926] - Partition columns are present in columns metadata for partition but not table
- [SPARK-16936] - Case Sensitivity Support for Refresh Temp Table
- [SPARK-16942] - CREATE TABLE LIKE generates External table when source table is an External Hive Serde table
- [SPARK-16943] - CREATE TABLE LIKE generates a non-empty table when source is a data source table
- [SPARK-16950] - fromOffsets parameter in Kafka's Direct Streams does not work in python3
- [SPARK-16953] - Make requestTotalExecutors public to be consistent with requestExecutors/killExecutors
- [SPARK-16955] - Using ordinals in ORDER BY causes an analysis error when the query has a GROUP BY clause using ordinals
- [SPARK-16959] - Table Comment in the CatalogTable returned from HiveMetastore is Always Empty
- [SPARK-16961] - Utils.randomizeInPlace does not shuffle arrays uniformly
- [SPARK-16966] - App Name is a randomUUID even when "spark.app.name" exists
- [SPARK-16975] - Spark-2.0.0 unable to infer schema for parquet data written by Spark-1.6.2
- [SPARK-16991] - Full outer join followed by inner join produces wrong results
- [SPARK-16994] - Filter and limit are illegally permuted.
- [SPARK-16995] - TreeNodeException when flat mapping RelationalGroupedDataset created from DataFrame containing a column created with lit/expr
- [SPARK-17010] - [MINOR]Wrong description in memory management document
- [SPARK-17013] - negative numeric literal parsing
- [SPARK-17016] - group-by/order-by ordinal should throw AnalysisException instead of UnresolvedException
- [SPARK-17022] - Potential deadlock in driver handling message
- [SPARK-17027] - PolynomialExpansion.choose is prone to integer overflow
- [SPARK-17038] - StreamingSource reports metrics for lastCompletedBatch instead of lastReceivedBatch
- [SPARK-17051] - we should use hadoopConf in InsertIntoHiveTable
- [SPARK-17056] - Fix a wrong assert in MemoryStore
- [SPARK-17061] - Incorrect results returned following a join of two datasets and a map step where total number of columns >100
- [SPARK-17065] - Improve the error message when encountering an incompatible DataSourceRegister
- [SPARK-17066] - dateFormat should be used when writing dataframes as csv files
- [SPARK-17086] - QuantileDiscretizer throws InvalidArgumentException (parameter splits given invalid value) on valid data
- [SPARK-17093] - Roundtrip encoding of array<struct<>> fields is wrong when whole-stage codegen is disabled
- [SPARK-17098] - "SELECT COUNT(NULL) OVER ()" throws UnsupportedOperationException during analysis
- [SPARK-17099] - Incorrect result when HAVING clause is added to group by query
- [SPARK-17100] - pyspark filter on a udf column after join gives java.lang.UnsupportedOperationException
- [SPARK-17104] - LogicalRelation.newInstance should follow the semantics of MultiInstanceRelation
- [SPARK-17110] - Pyspark with locality ANY throw java.io.StreamCorruptedException
- [SPARK-17113] - Job failure due to Executor OOM in offheap mode
- [SPARK-17114] - Adding a 'GROUP BY 1' where first column is literal results in wrong answer
- [SPARK-17115] - Improve the performance of UnsafeProjection for wide table
- [SPARK-17117] - 'SELECT 1 / NULL' throws AnalysisException, while 'SELECT 1 * NULL' works
- [SPARK-17120] - Analyzer incorrectly optimizes plan to empty LocalRelation
- [SPARK-17124] - RelationalGroupedDataset.agg should be order preserving and allow duplicate column names
- [SPARK-17158] - Improve error message for numeric literal parsing
- [SPARK-17160] - GetExternalRowField does not properly escape field names, causing generated code not to compile
- [SPARK-17162] - Range does not support SQL generation
- [SPARK-17167] - Issue Exceptions when Analyze Table on In-Memory Cataloged Tables
- [SPARK-17180] - Unable to Alter the Temporary View Using ALTER VIEW command
- [SPARK-17182] - CollectList and CollectSet should be marked as non-deterministic
- [SPARK-17194] - When emitting SQL for string literals Spark should use single quotes, not double
- [SPARK-17205] - Literal.sql does not properly convert NaN and Infinity literals
- [SPARK-17210] - sparkr.zip is not distributed to executors when run sparkr in RStudio
- [SPARK-17211] - Broadcast join produces incorrect results when compressed Oops differs between driver, executor
- [SPARK-17216] - Event timeline for a stage doesn't cover 100% of the timeline bar in Chrome
- [SPARK-17228] - Not infer/propagate non-deterministic constraints
- [SPARK-17230] - Writing decimal to csv will result empty string if the decimal exceeds (20, 18)
- [SPARK-17243] - Spark 2.0 history server summary page gets stuck at "loading history summary" with 10K+ application history
- [SPARK-17244] - Joins should not pushdown non-deterministic conditions
- [SPARK-17252] - Performing arithmetic in VALUES can lead to ClassCastException / MatchErrors during query parsing
- [SPARK-17253] - Left join where ON clause does not reference the right table produces analysis error
- [SPARK-17261] - Using HiveContext after re-creating SparkContext in Spark 2.0 throws "java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext"
- [SPARK-17264] - DataStreamWriter should document that it only supports Parquet for now
- [SPARK-17296] - Spark SQL: cross join + two joins = BUG
- [SPARK-17299] - TRIM/LTRIM/RTRIM strips characters other than spaces
- [SPARK-17306] - QuantileSummaries doesn't compress
- [SPARK-17309] - ALTER VIEW should throw exception if view not exist
- [SPARK-17323] - ALTER VIEW AS should keep the previous table properties, comment, create_time, etc.
- [SPARK-17335] - Creating Hive table from Spark data
- [SPARK-17336] - Repeated calls sbin/spark-config.sh file Causes ${PYTHONPATH} Value duplicate
- [SPARK-17339] - Fix SparkR tests on Windows
- [SPARK-17342] - Style of event timeline is broken
- [SPARK-17352] - Executor computing time can be negative-number because of calculation error
- [SPARK-17353] - CREATE TABLE LIKE statements when Source is a VIEW
- [SPARK-17354] - java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date
- [SPARK-17355] - Work around exception thrown by HiveResultSetMetaData.isSigned
- [SPARK-17356] - A large Metadata field in Alias can cause OOM when calling TreeNode.toJSON
- [SPARK-17358] - Cached table(parquet/orc) should be shared between beelines
- [SPARK-17364] - Can not query hive table starting with number
- [SPARK-17369] - MetastoreRelation toJSON throws exception
- [SPARK-17370] - Shuffle service files not invalidated when a slave is lost
- [SPARK-17376] - Spark version should be available in R
- [SPARK-17391] - Fix Two Test Failures After Backport
- [SPARK-17396] - Threads number keep increasing when query on external CSV partitioned table
- [SPARK-17418] - Spark release must NOT distribute Kinesis related assembly artifact
- [SPARK-17438] - Master UI should show the correct core limit when `ApplicationInfo.executorLimit` is set
- [SPARK-17439] - QuantileSummaries returns the wrong result after compression
- [SPARK-17442] - Additional arguments in write.df are not passed to data source
- [SPARK-17463] - Serialization of accumulators in heartbeats is not thread-safe
- [SPARK-17465] - Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
- [SPARK-17474] - Python UDF does not work between Sort and Limit
- [SPARK-17491] - MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used
- [SPARK-17494] - Floor/ceil of decimal returns wrong result if it's in compact format
- [SPARK-17502] - Multiple Bugs in DDL Statements on Temporary Views
- [SPARK-17503] - Memory leak in Memory store when unable to cache the whole RDD in memory
- [SPARK-17511] - Dynamic allocation race condition: Containers getting marked failed while releasing
- [SPARK-17512] - Specifying remote files for Python based Spark jobs in Yarn cluster mode not working
- [SPARK-17514] - df.take(1) and df.limit(1).collect() perform differently in Python
- [SPARK-17515] - CollectLimit.execute() should perform per-partition limits
- [SPARK-17521] - Error when I use sparkContext.makeRDD(Seq())
- [SPARK-17525] - SparkContext.clearFiles() still present in the PySpark bindings though the underlying Scala method was removed in Spark 2.0
- [SPARK-17531] - Don't initialize Hive Listeners for the Execution Client
- [SPARK-17541] - fix some DDL bugs about table management when same-name temp view exists
- [SPARK-17545] - Spark SQL Catalyst doesn't handle ISO 8601 date without colon in offset
- [SPARK-17546] - start-* scripts should use hostname -f
- [SPARK-17547] - Temporary shuffle data files may be leaked following exception in write
- [SPARK-17548] - Word2VecModel.findSynonyms can spuriously reject the best match when invoked with a vector
- [SPARK-17567] - Broken link to Spark paper
- [SPARK-17571] - AssertOnQuery.condition should be consistent in requiring Boolean return type
- [SPARK-17599] - Folder deletion after globbing may fail StructuredStreaming jobs
- [SPARK-17613] - PartitioningAwareFileCatalog.allFiles doesn't handle URI specified path at parent
- [SPARK-17616] - Getting "java.lang.RuntimeException: Distinct columns cannot exist in Aggregate"
- [SPARK-17617] - Remainder(%) expression.eval returns incorrect result
- [SPARK-17618] - Dataframe except returns incorrect results when combined with coalesce
- [SPARK-17627] - Streaming Providers should be labeled Experimental
- [SPARK-17641] - collect_set should ignore null values
- [SPARK-17644] - The failed stage never resubmitted due to abort stage in another thread
- [SPARK-17650] - Adding a malformed URL to sc.addJar and/or sc.addFile bricks Executors
- [SPARK-17652] - Fix confusing exception message while reserving capacity
- [SPARK-17666] - take() or isEmpty() on dataset leaks s3a connections
- [SPARK-17672] - Spark 2.0 history server web Ui takes too long for a single application
- [SPARK-17673] - Reused Exchange Aggregations Produce Incorrect Results
- [SPARK-17752] - Spark returns incorrect result when 'collect()'ing a cached Dataset with many columns
- [SPARK-17809] - scala.MatchError: BooleanType when casting a struct
Documentation
- [SPARK-16295] - Extract SQL programming guide example snippets from source files instead of hard code them
- [SPARK-16761] - Fix doc link in docs/ml-guide.md
- [SPARK-16911] - Remove migrating to a Spark 1.x version in programming guide documentation
- [SPARK-17085] - Documentation and actual code differs - Unsupported Operations
- [SPARK-17089] - Remove link of api doc for mapReduceTriplets because it's removed from the API.
- [SPARK-17242] - Update links of external dstream projects
- [SPARK-17561] - DataFrameWriter documentation formatting problems
- [SPARK-17575] - Make correction in configuration documentation table tags
Improvement
- [SPARK-2424] - ApplicationState.MAX_NUM_RETRY should be configurable
- [SPARK-10835] - Word2Vec should accept non-null string array, in addition to existing null string array
- [SPARK-12370] - Documentation should link to examples from its own release version
- [SPARK-13286] - JDBC driver doesn't report full exception
- [SPARK-15639] - Try to push down filter at RowGroups level for parquet reader
- [SPARK-15703] - Make ListenerBus event queue size configurable
- [SPARK-15923] - Spark Application rest api returns "no such app: <appId>"
- [SPARK-16216] - CSV data source does not write date and timestamp correctly
- [SPARK-16240] - model loading backward compatibility for ml.clustering.LDA
- [SPARK-16320] - Document G1 heap region's effect on spark 2.0 vs 1.6
- [SPARK-16324] - regexp_extract should doc that it returns empty string when match fails
- [SPARK-16568] - update sql programming guide refreshTable API
- [SPARK-16650] - Improve documentation of spark.task.maxFailures
- [SPARK-16651] - Document no exception using DataFrame.withColumnRenamed when existing column doesn't exist
- [SPARK-16663] - desc table should be consistent between data source and hive serde tables
- [SPARK-16764] - Recommend disabling vectorized parquet reader on OutOfMemoryError
- [SPARK-16772] - Correct API doc references to PySpark classes + formatting fixes
- [SPARK-16796] - Visible passwords on Spark environment page
- [SPARK-16805] - Log timezone when query result does not match
- [SPARK-16812] - Open up SparkILoop.getAddedJars
- [SPARK-16813] - Remove private[sql] and private[spark] from catalyst package
- [SPARK-16870] - Add "spark.sql.broadcastTimeout" to docs/sql-programming-guide.md to help people fix this timeout error when it happens
- [SPARK-16875] - Add args checking for DataSet randomSplit and sample
- [SPARK-16877] - Add a rule for preventing use Java's Override annotation
- [SPARK-16932] - Programming-guide Accumulator section should be more clear w.r.t new API
- [SPARK-16935] - Verification of Function-related ExternalCatalog APIs
- [SPARK-16947] - Support type coercion and foldable expression for inline tables
- [SPARK-16964] - Remove private[sql] and private[spark] from sql.execution package
- [SPARK-17023] - Update Kafka connector to use Kafka 0.10.0.1
- [SPARK-17063] - MSCK REPAIR TABLE is super slow with Hive metastore
- [SPARK-17084] - Rename ParserUtils.assert to validate
- [SPARK-17186] - remove catalog table type INDEX
- [SPARK-17193] - HadoopRDD NPE at DEBUG log level when getLocationInfo == null
- [SPARK-17231] - Avoid building debug or trace log messages unless the respective log level is enabled
- [SPARK-17246] - Support BigDecimal literal parsing
- [SPARK-17279] - better error message for exceptions during ScalaUDF execution
- [SPARK-17297] - Clarify window/slide duration as absolute time, not relative to a calendar
- [SPARK-17301] - Remove unused classTag field from AtomicType base class
- [SPARK-17316] - Don't block StandaloneSchedulerBackend.executorRemoved
- [SPARK-17347] - Encoder in Dataset example has incorrect type
- [SPARK-17378] - Upgrade snappy-java to 1.1.2.6
- [SPARK-17421] - Document warnings about "MaxPermSize" parameter when building with Maven and Java 8
- [SPARK-17445] - Reference an ASF page as the main place to find third-party packages
- [SPARK-17480] - CompressibleColumnBuilder inefficiently calls gatherCompressibilityStats
- [SPARK-17483] - Minor refactoring and cleanup in BlockManager block status reporting and block removal
- [SPARK-17484] - Race condition when cancelling a job during a cache write can lead to block fetch failures
- [SPARK-17485] - Failed remote cached block reads can lead to whole job failure
- [SPARK-17486] - Remove unused TaskMetricsUIData.updatedBlockStatuses field
- [SPARK-17558] - Bump Hadoop 2.7 version from 2.7.2 to 2.7.3
- [SPARK-17569] - Don't recheck existence of files when generating File Relation resolution in StructuredStreaming
- [SPARK-17577] - SparkR support add files to Spark job and get by executors
- [SPARK-17609] - SessionCatalog.tableExists should not check temp view
- [SPARK-17638] - Stop JVM StreamingContext when the Python process is dead
- [SPARK-17640] - Avoid using -1 as the default batchId for FileStreamSource.FileEntry
- [SPARK-17649] - Log how many Spark events got dropped in LiveListenerBus
- [SPARK-17651] - Automate Spark version update for documentations
- [SPARK-18391] - Openstack deployment scenarios
New Feature
- [SPARK-16956] - Make ApplicationState.MAX_NUM_RETRY configurable
- [SPARK-17069] - Expose spark.range() as table-valued function in SQL
- [SPARK-17150] - Support SQL generation for inline tables
- [SPARK-17456] - Utility for parsing Spark versions
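As a rough illustration of what a version-parsing utility like the one named in SPARK-17456 might do, the sketch below extracts the major and minor numbers from a version string such as "2.0.1". The function name `major_minor_version` and the regular expression are assumptions made for this example; this is not Spark's own implementation.

```python
import re
from typing import Tuple

# Hypothetical pattern for illustration: major.minor, optionally followed
# by further dot-separated parts (patch number, "-SNAPSHOT" suffix, ...).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(\..*)?$")


def major_minor_version(version: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a version string like '2.0.1'."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("Cannot parse version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(major_minor_version("2.0.1"))
    print(major_minor_version("2.1.0-SNAPSHOT"))
```

A helper of this shape lets callers branch on the release line (for example, 2.0 versus 2.1) without string comparisons.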
Question
- [SPARK-17794] - 2.0.1 not in maven central repo?
- Maven dependency:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.0.1</version>
</dependency>
Repository URL: https://repo1.maven.org/maven2/org/apache/spark/spark-core_2.11/
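The repository address above follows the standard Maven repository layout, where dots in the groupId become path separators and the artifactId and version are appended as directories. A minimal sketch of that mapping is shown below; the helper name `artifact_directory_url` is an assumption for this example, and the code only builds the URL string without contacting the repository.

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def artifact_directory_url(group_id, artifact_id, version=None, base=MAVEN_CENTRAL):
    """Build the repository directory URL for a Maven artifact.

    Hypothetical helper for illustration: dots in the groupId are turned
    into slashes, then the artifactId and (optionally) the version follow.
    """
    parts = [base.rstrip("/"), group_id.replace(".", "/"), artifact_id]
    if version is not None:
        parts.append(version)
    return "/".join(parts) + "/"


if __name__ == "__main__":
    # Directory listing all published versions of spark-core_2.11
    print(artifact_directory_url("org.apache.spark", "spark-core_2.11"))
    # Directory expected to hold the 2.0.1 release files
    print(artifact_directory_url("org.apache.spark", "spark-core_2.11", "2.0.1"))
```

Opening the version-specific URL in a browser is one way to check whether a given release, such as 2.0.1, has been published.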