Before I knew it, seven months have passed since my last post. Time really flies.
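Last time I had just posted about KNIME's upgrade to 2.1.1, and now I see that KNIME was upgraded to 2.2.2 on September 30. I'll paste the official change list first, then share my own thoughts.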
Changes from v2.1.1 to v2.1.2
Enhancements
- Enh 2178: New Node: Perl node to be published on /labs
- Enh 2109: Unpivoting node option needs option "Ignore missing values"
- Enh 2132: Add (unsupported) java property to disable table stream compression
- Enh 2158: BatchExecutor needs to support "long" options
- Enh 2172: HiliteHandler to fire events asynchronously (needed for Spotfire node)
Bug Fixes
- Bug 2084: Concatenate node: hilite translation does not work with duplicate keys
- Bug 2097: Conditional Boxplot ignores domain
- Bug 2108: Pivot Node Dialog with wrong enabled/disabled combo boxes
- Bug 2112: DatabaseConnectionSettings driver file not found without DEBUG stacktrace
- Bug 2114: Database Writer can't append DoubleCell (NUMBER) columns
- Bug 2116: Domain Calculator drops domains of unselected columns
- Bug 2119: Scatter matrix displays wrong labels on y-axis
- Bug 2120: Local temp tables not cleared when node is reset (using Sub-ExecutionContext)
- Bug 2121: race condition when opening Buffer results in endless loop
- Bug 2126: k-means node uses wrong update delta indices in case of missing values.
- Bug 2128: Java Snippet throws error when flow variable is accessed multiple times in the script
- Bug 2131: Web service node throws NPE in presence of (certain) array elements in schema
- Bug 2143: Batch executor does not fully execute (race condition)
- Bug 2144: SVM Learner chokes on invalid input (either no or inappropriate columns)
- Bug 2146: Project icon goes blank when workflow fails (Linux)
- Bug 2148: Naive Bayes missing value/laplace corrector problem
- Bug 2152: PNN and Fuzzy Predictor configure fails with duplicate columns
- Bug 2154: RProp, DecTree, k-means PMML learner fail when input contains 3rd party column types
- Bug 2157: PCA Apply does not accept percentage criterion (dialog setting ignored)
- Bug 2161: libsvm extension column order (wrong prediction when test spec does not match train spec)
- Bug 2162: Scorer with switched FP / FN values in Accuracy Table
- Bug 2171: Database Timeout not correctly initialized, always 0
- Bug 2173: FileReader doesn't recognize missValue pattern with skipped cols
Bug Fixes (Reporting)
- Bug 2113: Too many rows in table report elements when creating final report
- Bug 2122: Exception when generating report after closing workflow
- Bug 2147: Reporting extension blocks Manifest editor in KNIME SDK (ClassCastException)
Changes from v2.1.2 to v2.2.0
Major new features
- Enh: Workflow example server "publicserver.knime.org"
- Enh 2263: Add flow variables port to each node (enables control on execution order)
- Enh: Allow for optional inputs
- Enh 2254: Support for workflow-local files (separate storage in workflow folder)
- Enh 2287: Loop-Concept for Chunking (implement streaming like approach)
- Enh 2329: New Item Set Miner Feature
- Enh 2347: Modular Data Generation plug-in (new set of nodes)
- Enh 2349: Web Analytics plug-in (new set of nodes)
Minor new features
- Bug 2258: Introduce credentials store on workflow, replacing current master key concept
- Bug 2318: Define extension point for aggregation method
- Bug 2184: Extension point for distance functions
- Bug 2182: PMML 3.2 support for PMMLDecisionTreeHandler, PMMLGeneralRegressionContentHandler and PMMLNeuralNetworkHandler
- Bug 2211: K-Means node to use "include all columns" checker
- Bug 1963: ColumnFilterPanel needs adjustable column filter
- Bug 2213: KNIME Desktop Version to include reporting update site
New Nodes
- Bug 1768: Logistic Regression Learner & Applier
- Bug 2289: New Joiner Implementation (more flexible matching criteria, scalability, composite keys,...)
- Bug 2290: CSV Reader (more flexible than File Reader node when input structure changes)
- Bug 1957: Ungroup node (Split Collection in Rows)
- Bug 2304: Loop End node with two in/outputs
Bug fixes
- Bug 2281: BatchExecutor executes much slower than GUI
- Bug 2214: Sorter throws OutOfMemory for 100k x 2k dimensional table (with numbers only)
- Bug 2186: DateAndTimeCell returns wrong string value
- Bug 2188: Perl Array Return Problem
- Bug 2189: Can't use collection cells in domain information (serialization problem)
- Bug 2193: R .RData files must not be created (problems with multiple processes)
- Bug 2194: PMML Importer should fail when importing multiple models
- Bug 2197: NPE in SDFWriter when some fields are missing
- Bug 2198: Sdf Reader does not close input streams when checking file existence
- Bug 2199: File Reader does not close input streams when checking file existence (see bug 2198)
- Bug 2201: Distance Matrix reader does not close input streams when checking file existence (see bug 2198)
- Bug 2215: DecTree predictor produces wrong output in case of missing values
- Bug 2216: missing clone in InMemoryTable screws Decision Tree
- Bug 2217: BatchExecutor doesn't execute projects with SGE job manager
- Bug 2221: Typo in "Import Workflow..." dialog
- Bug 2223: scaling problem in default dialog components (e.g. column filter) when having many (100k) columns
- Bug 2225: Source nodes in meta nodes running in loops let the loop block (stay UNCONFIGURED_MARKFOREXEC)
- Bug 2231: DistanceMatrixReader slow with full matrices
- Bug 2283: Association Rule Learning missing options in node description
- Bug 2284: R nodes generate tons of tmp-files
- Bug 2291: ConcurrentModificationException in StringHistory
- Bug 2306: WorkflowManager needs to have method "waitWhileInExecution"
- Bug 2345: Wrong project icon in project navigator when workflow has meta-infos
- Bug 2348: Domain calculator does not drop domain when no columns in include list
- Bug 1355: DialogComponentButtonGroup layout problem for long border titles
- Bug 1991: Progress bar at wrong position in splash screen
- Bug 2060: Extending the group by functionality to handle DateAndTimeCells
- Bug 2061: GroupByNode should be extendable via AggregationMethod registration
- Bug 2062: GroupBy node should use other mean calculation algorithm to prevent buffer overflow
- Bug 2206: SVM PMML Ex/Import logs (unimportant?) warning messages
- Bug 2277: State change listener not removed in WorkflowManager#executeAllAndWaitUntilDone
- Bug 2325: Node directory in meta nodes not deleted when node is removed
- Bug 2326: Weka node dialogs need ScrollPane
- Bug 2330: Association Rule Viewer supporting new input
- Bug 2332: PMML Spec Creator resets learn/target columns upon createSpec()
- Bug 2333: k-Means to fail on missing values (has currently normalization problems, see bug 2127)
- Bug 2334: DistMatrixPlotterNodeFactory from plugin 'org.knime.distmatrix' could not be created
- Bug 2338: Reset on meta node does not configure contained source nodes.
- Bug 2339: Naive Bayes predictor produces number overflow
- Bug 2340: Aggregation column panel in histogram node shouldn't be editable
- Bug 2355: Decision Tree Predictor fails with null pointer exception on empty class distributions
- Bug 2080: GroupBy sorts without a reason (way too slow!!!)
- Bug 2135: FlowVariable stack gets large in loops (contains duplicate items) and causes huge settings.xml files
Changes from v2.2.0 to v2.2.2
Enhancements
- Enh 2383: report designer view to show "Scripts" and "XML Source" tabs in expert mode (for next bug fix release)
- Enh 2403: Excel reader to read *xlsx files (2007 & 2010)
- Enh 2391: Spotfire node to be published on labs
Bug fixes
- Bug 2326: Weka node dialogs need ScrollPane
- Bug 2346: R2PMML node fails with incompatible PMML models
- Bug 2360: PDF and HTML Report nodes: init idle, but configure after load
- Bug 2362: Model Reader Node throws NPE if fed with non-zip file
- Bug 2363: Database connection: handle null passwords as empty
- Bug 2365: Table Reader node to accept URLs (currently requires absolute files)
- Bug 2370: Joiner node has confusing node description (no dialog control "Buffer Size")
- Bug 2371: PCA nodes have problems with columns containing only missing values
- Bug 2372: Newly created nodes remain "dirty" following save (saved twice)
- Bug 2373: Feature Elimination Start node fails with IAE in ColumnRearranger
- Bug 2374: ROC plot shows 1.01 as highest value
- Bug 2377: Credentials option not available in Node Description
- Bug 2378: DatabaseLooping nodes does not support Credentials
- Bug 2379: Using Credentials breaks database Preview tab
- Bug 2388: Credentials Dialog should show workflow name
- Bug 2389: Node label and description centered for small bounding boxes
- Bug 2402: Excel Reader does not read from URL (required for node drop files)
- Bug 2404: Decision Tree Predictor View Pie Chart Problem
- Bug 2405: CSV Reader has problems converting URL to file path (spaces still %20)
- Bug 2406: Decision Tree Learner showing missing value colors
- Bug 2410: Subset matcher does not find all matching subsets
- Bug 2414: GroupBy node way too slow if there are many irrelevant columns
- Bug 2415: Variable Based File requires original input file to work
- Bug 2416: DB Reader Node resets even if settings did not change
- Bug 2428: Omitting predicates on loading PMML compound predicates
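Below are my own impressions; I upgraded from 2.1.1 to 2.2.2.
1. What pleases me most is that the node for writing XLS files can finally convert Chinese text correctly. Previously, all Chinese text written by this node came out garbled; now I can finally drop the CSV output node.
2. Many nodes now have options to load/save their current settings.
3. Some new nodes have been added.
4. The RowID node has gained functionality, including options for handling duplicate values and filling in missing values.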