import

Differences

This shows you the differences between two versions of the page.

import [2012/06/12 13:34]
Oleksiy [CSV/XLS/XLSX Import]
import [2017/06/02 09:31]
Line 1: Line 1:
-====== Import ====== 
- 
-The import module reads data from data sources. 
-  * CSV/XLS/XLSX Import 
-  * ODBC/OLEDB Import 
-   
- 
-==== CSV/XLS/XLSX Import ==== 
- 
-The module has support for 
-  * Tables that store variables as columns (by default). 
-  * Tables that store variables as rows. 
-  * CSV, XLS and XLSX file formats. 
-  * Reading table headings from the first row. 
-  * Reading row labels from a column that contains unique Timestamps or IDs. 
-  * Reading row labels composed from two or three separate columns: Year + Month, Year + Week, Year + Quarter, Year + Month + Day. 
-  * Detection of text (categorical) variables (blue columns). 
-  * Detection of date/time variables in the dataset (green columns). 
-  * Reading records in backward order. 
-  * Missing values detection. 
- 
-:!: Limitations: \\ 
-  * The module reads data only from the first sheet of an .xls or .xlsx file.\\ 
-  * The module can't read from password-protected files.   
- 
-Data files may consist of numeric, categorical (text) or date/time columns. Column types are indicated by color: 
- 
-  * Categorical (text) - blue. 
-  * Date/time - green. 
-  * Missing values - light grey (if the cell is not empty). 
- 
-Column names are allowed but not required. 
-GMDH Shell does not provide tools for file editing, but it lets you keep the data file open in an editor. You do not have to close the editor (Excel, Notepad, etc.) before clicking the Import or the Start button. You can modify the data file in your editor, save the changes and immediately start recalculation of results in GMDH Shell. 
- 
-=== Import dialog === 
- 
-  - Click on the Import button {{:img:button-24_import.png?height=16|Import button}} located in the toolbar.  
-  - Select one of your data files and press OK. The Import configuration dialog opens. 
-  - In the Import dialog, set the import parameters and press OK. 
-  - If your project folder doesn't contain project settings yet, the Template selection dialog appears after the Import configuration dialog.  
-  - In the Template selection dialog choose a relevant [[Templates|Template]] and press OK. 
- 
-If your project folder contains several data files, the Import module makes all of them available in the Data manager. Selecting just one file points the Import module to the whole directory. 
- 
-The file selected during the import procedure receives a special status ''Current'' in the Data manager. Only variables from the Current file can be used as model inputs or targets without a filename prefix: for example, ''var1'' instead of the ''filename.var1'' form required for other files. 
- 
-The path of the Current file appears in the title bar of the GMDH Shell window: 
- 
-{{:img:_window_title_bar.png|}} 
- 
-==== Import configuration ==== 
- 
-{{:img:dialog_import_csv-xls-xlsx.png?width=513|Import dialog}} 
- 
-== Read column labels from the 1st row ==  
-Reads column names from the first row of the data file(s). The number of elements in the first row is used to detect the width of the data table. 
- 
-== Read row labels (ID, timestamp) from column N == 
-If your data has unique row identifiers, for example calendar dates, you can tell the Importer which column they are in and use them for visualization instead of the default ID marks. For example, row labels serve as timestamps for time series charts. 
-If there are multiple data files, the row labels are taken from the ''Current'' data file. 
- 
-Datasets often have date marks, such as year and month or week, located in separate columns. In that case you can compose timestamps from several columns using the option **Compose ID from several columns**. 
-All columns to be combined must be adjacent in the dataset, and the option **Read row labels from column N** must point to the first of them. 
- 
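The composition rule above can be illustrated with a short Python sketch. It is not GMDH Shell's code: the function name ''compose_row_label'', the 0-based column index and the ''-'' separator are assumptions made for this illustration, and the label format the Importer actually produces may differ.

```python
def compose_row_label(row, first_col, n_cols, sep="-"):
    """Join n_cols adjacent cells, starting at first_col (0-based), into one label.

    Hypothetical helper: it mirrors the rule that the combined columns must be
    adjacent and that the option points to the first of them.
    """
    parts = [str(cell).strip() for cell in row[first_col:first_col + n_cols]]
    if len(parts) != n_cols or any(part == "" for part in parts):
        raise ValueError("label columns are missing or empty")
    return sep.join(parts)


# Year + Month + Day stored in the first three columns, followed by one variable.
rows = [
    ["2012", "6", "12", "3.5"],
    ["2012", "6", "13", "3.7"],
]
labels = [compose_row_label(row, 0, 3) for row in rows]
print(labels)
```

A Year + Month label uses the same call with ''n_cols=2''.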
-==CSV delimiter== 
- 
-Sets a delimiter type. Applicable to CSV files only. 
- 
-==Missing value mark== 
- 
-The Import module is responsible for detecting missing values. It replaces cells that match the missing value conditions with regular NULL values, which allows the [[Preprocess|Preprocessor]] module to handle missing values appropriately. 
- 
-==Consider text cells as missing== 
- 
-Replaces any non-numeric values with regular NULL values. 
- 
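The two missing-value options can be sketched in Python as follows. This only illustrates the behavior described above and is not the Importer's implementation: the function name, the ''?'' missing mark and the treatment of empty cells as missing are assumptions, and ''None'' stands in for NULL.

```python
def to_value_or_none(cell, missing_mark="?", text_as_missing=False):
    """Convert one raw cell to a number, a text value, or None (NULL).

    missing_mark    -- hypothetical stand-in for the "Missing value mark" setting
    text_as_missing -- stand-in for "Consider text cells as missing"
    """
    text = cell.strip()
    # Assumption: an empty cell and the configured mark both count as missing.
    if text == "" or text == missing_mark:
        return None
    try:
        return float(text)
    except ValueError:
        # Non-numeric cell: keep it as categorical text unless told otherwise.
        return None if text_as_missing else text


raw_row = ["3.5", "?", "red", ""]
print([to_value_or_none(cell) for cell in raw_row])
print([to_value_or_none(cell, text_as_missing=True) for cell in raw_row])
```

With ''text_as_missing=True'' the text cell ''red'' also becomes ''None'', so only numeric values survive.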
- 
-==== Data file examples ==== 
- 
-\\ 
- 
-{{:img:window_openoffice_dataset.png|}} 
- 
-\\ 
- 
-{{:img:window_notepad_dataset.png|}} 
- 
- 
- 
-==== ODBC/OLEDB Import ==== 
- 
-This preprocessor will be available soon. 
-  
- 
-~~UP~~ 
- 
- 
- 
  
import.txt · Last modified: 2021/06/01 03:27 (external edit)