GnuCash  5.6-150-g038405b370+
GncTxImport Class Reference

The actual TxImport class. It is intended to be used in the following sequence of actions:

#include <gnc-import-tx.hpp>

Public Member Functions

 GncTxImport (GncImpFileFormat format=GncImpFileFormat::UNKNOWN)
 Constructor for GncTxImport.
 
 ~GncTxImport ()
 Destructor for GncTxImport.
 
void file_format (GncImpFileFormat format)
 Sets the file format for the file to import, which may cause the file to be reloaded as well if the previously set file format was different and a filename was already set.
 
GncImpFileFormat file_format ()
 
void multi_split (bool multi_split)
 Toggles the multi-split state of the importer and will subsequently sanitize the column_types list.
 
bool multi_split ()
 
void base_account (Account *base_account)
 Sets a base account.
 
Account * base_account ()
 
void currency_format (int currency_format)
 
int currency_format ()
 
void date_format (int date_format)
 
int date_format ()
 
void encoding (const std::string &encoding)
 Converts raw file data using a new encoding.
 
std::string encoding ()
 
void update_skipped_lines (std::optional< uint32_t > start, std::optional< uint32_t > end, std::optional< bool > alt, std::optional< bool > errors)
 
uint32_t skip_start_lines ()
 
uint32_t skip_end_lines ()
 
bool skip_alt_lines ()
 
bool skip_err_lines ()
 
void separators (std::string separators)
 
std::string separators ()
 
void settings (const CsvTransImpSettings &settings)
 
bool save_settings ()
 
void settings_name (std::string name)
 
std::string settings_name ()
 
void load_file (const std::string &filename)
 Loads a file into a GncTxImport.
 
void tokenize (bool guessColTypes)
 Splits a file into cells.
 
std::string verify (bool with_acct_errors)
 
void create_transactions ()
 This function will attempt to convert all tokenized lines into transactions using the column types the user has set.
 
bool check_for_column_type (GncTransPropType type)
 
void set_column_type (uint32_t position, GncTransPropType type, bool force=false)
 
std::vector< GncTransPropType > column_types ()
 
std::set< std::string > accounts ()
 

Data Fields

std::unique_ptr< GncTokenizer > m_tokenizer
 Will handle file loading/encoding conversion/splitting into fields.
 
std::vector< parse_line_t > m_parsed_lines
 source file parsed into a two-dimensional array of strings.
 
std::multimap< time64, std::shared_ptr< DraftTransaction > > m_transactions
 map of transaction objects created from parsed_lines and column_types, ordered by date
 

Detailed Description

The actual TxImport class. It is intended to be used in the following sequence of actions:

Definition at line 104 of file gnc-import-tx.hpp.

Member Function Documentation

◆ base_account()

void GncTxImport::base_account ( Account *  base_account)

Sets a base account.

This is the account all import data relates to. As such at least one split of each transaction that will be generated will be in this account. When a base account is set, there can't be an account column selected in the import data. In multi-split mode the user has to select an account column so in that mode the base_account can't be set.
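The rule that a base account and an account column exclude each other can be sketched in isolation. The following is a minimal model, not the GnuCash implementation: `ColType`, `ImportSettings` and `set_base_account` are hypothetical stand-ins introduced only for illustration, and the reparsing step of the real function is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not GnuCash types.
enum class ColType { NONE, DATE, DESCRIPTION, ACCOUNT, AMOUNT };
struct Account { int id; };

struct ImportSettings
{
    bool multi_split = false;
    Account* base_account = nullptr;
    std::vector<ColType> column_types;
};

// Models the documented rule: in multi-split mode the base account is
// cleared, otherwise setting one unselects an existing ACCOUNT column.
void set_base_account (ImportSettings& s, Account* acct)
{
    if (s.multi_split)
    {
        s.base_account = nullptr;
        return;
    }

    s.base_account = acct;
    if (!s.base_account)
        return;

    auto it = std::find (s.column_types.begin(), s.column_types.end(),
                         ColType::ACCOUNT);
    if (it != s.column_types.end())
        *it = ColType::NONE;
}
```

With columns DATE, ACCOUNT, AMOUNT and a non-null account, the second column ends up as NONE in this model. Once `multi_split` is true, any call leaves the base account unset.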

Parameters
base_account  Pointer to an account or NULL.

Definition at line 177 of file gnc-import-tx.cpp.

{
    if (m_settings.m_multi_split)
    {
        m_settings.m_base_account = nullptr;
        return;
    }

    auto base_account_is_new = m_settings.m_base_account == nullptr;
    m_settings.m_base_account = base_account;

    if (m_settings.m_base_account)
    {
        auto col_type_it = std::find (m_settings.m_column_types.begin(),
                m_settings.m_column_types.end(), GncTransPropType::ACCOUNT);
        if (col_type_it != m_settings.m_column_types.end())
            set_column_type(col_type_it - m_settings.m_column_types.begin(),
                            GncTransPropType::NONE);

        if (base_account_is_new)
        {
            /* Set default account for each line's split properties */
            for (auto line : m_parsed_lines)
                std::get<PL_PRESPLIT>(line)->set_account (m_settings.m_base_account);
        }
        else
        {
            /* Reparse all of the lines with the new base account's commodity. */
            tokenize(false);
        }

    }
}

◆ create_transactions()

void GncTxImport::create_transactions ( )

This function will attempt to convert all tokenized lines into transactions using the column types the user has set.

Creates a list of transactions from parsed data.

The parsed data will first be validated. If any errors are found in lines that are marked for processing (i.e. not marked to skip), this function will throw an exception.

Parameters
skip_errors  true to skip over lines with errors
Exceptions
std::invalid_argument  if data validation or processing fails.
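The validate-then-convert pattern described above, together with caller-side handling of std::invalid_argument, might look roughly like the sketch below. It is a simplified model under stated assumptions: `Line`, `try_create` and the free functions `verify` and `create_transactions` are hypothetical stand-ins, and the real importer works on GncTxImport members rather than on a plain vector.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed line: an error text and a skip flag.
struct Line { std::string error; bool skip; };

// Collects the errors of all lines that are not marked to skip.
// An empty result means the data is valid.
std::string verify (const std::vector<Line>& lines)
{
    std::string msg;
    for (const auto& l : lines)
        if (!l.skip && !l.error.empty())
            msg += l.error + "\n";
    return msg;
}

// Validates first, then converts every non-skipped line.
// Returns the number of converted lines in this model.
std::size_t create_transactions (const std::vector<Line>& lines)
{
    auto verify_result = verify (lines);
    if (!verify_result.empty())
        throw std::invalid_argument (verify_result);

    std::size_t created = 0;
    for (const auto& l : lines)
        if (!l.skip)
            ++created;
    return created;
}

// Caller side: turn the exception into an error message for the user.
bool try_create (const std::vector<Line>& lines, std::string& err)
{
    try
    {
        create_transactions (lines);
        return true;
    }
    catch (const std::invalid_argument& e)
    {
        err = e.what();
        return false;
    }
}
```

A line with an error only blocks the conversion when it is not marked to skip, which is why marking faulty lines as skipped is a way to let the import proceed.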

Definition at line 724 of file gnc-import-tx.cpp.

{
    /* Start with verifying the current data. */
    auto verify_result = verify (true);
    if (!verify_result.empty())
        throw std::invalid_argument (verify_result);

    /* Drop all existing draft transactions */
    m_transactions.clear();

    m_parent = nullptr;

    /* Iterate over all parsed lines */
    for (auto parsed_lines_it = m_parsed_lines.begin();
            parsed_lines_it != m_parsed_lines.end();
            ++parsed_lines_it)
    {
        /* Skip current line if the user specified so */
        if ((std::get<PL_SKIP>(*parsed_lines_it)))
            continue;

        /* Should not throw anymore, otherwise verify needs revision */
        create_transaction (parsed_lines_it);
    }
}

◆ encoding()

void GncTxImport::encoding ( const std::string &  encoding)

Converts raw file data using a new encoding.

This function only needs to be called if load_file guessed the wrong encoding, and in that case it must be called after load_file.

Parameters
encoding  Encoding to use when converting the data

Definition at line 256 of file gnc-import-tx.cpp.

{

    // TODO investigate if we can catch conversion errors and report them
    if (m_tokenizer)
    {
        m_tokenizer->encoding(encoding); // May throw
        try
        {
            tokenize(false);
        }
        catch (...)
        { };
    }

    m_settings.m_encoding = encoding;
}

◆ file_format()

void GncTxImport::file_format ( GncImpFileFormat  format)

Sets the file format for the file to import, which may cause the file to be reloaded as well if the previously set file format was different and a filename was already set.

Parameters
format  the new format to set
Exceptions
std::ifstream::failure  if file reloading fails

Definition at line 87 of file gnc-import-tx.cpp.

{
    if (m_tokenizer && m_settings.m_file_format == format)
        return;

    auto new_encoding = std::string("UTF-8");
    auto new_imp_file = std::string();

    // Recover common settings from old tokenizer
    if (m_tokenizer)
    {
        new_encoding = m_tokenizer->encoding();
        new_imp_file = m_tokenizer->current_file();
        if (file_format() == GncImpFileFormat::FIXED_WIDTH)
        {
            auto fwtok = dynamic_cast<GncFwTokenizer*>(m_tokenizer.get());
            if (!fwtok->get_columns().empty())
                m_settings.m_column_widths = fwtok->get_columns();
        }
    }

    m_settings.m_file_format = format;
    m_tokenizer = gnc_tokenizer_factory(m_settings.m_file_format);

    // Set up new tokenizer with common settings
    // recovered from old tokenizer
    m_tokenizer->encoding(new_encoding);
    load_file(new_imp_file);

    // Restore potentially previously set separators or column_widths
    if ((file_format() == GncImpFileFormat::CSV)
        && !m_settings.m_separators.empty())
        separators (m_settings.m_separators);
    else if ((file_format() == GncImpFileFormat::FIXED_WIDTH)
        && !m_settings.m_column_widths.empty())
    {
        auto fwtok = dynamic_cast<GncFwTokenizer*>(m_tokenizer.get());
        fwtok->columns (m_settings.m_column_widths);
    }

}

◆ load_file()

void GncTxImport::load_file ( const std::string &  filename)

Loads a file into a GncTxImport.

This is the first function that must be called after creating a new GncTxImport. Until this function has run successfully, the importer can't proceed.

Parameters
filename  Name of the file that should be opened
Exceptions
std::ifstream::failure  may be thrown on any I/O error

Definition at line 377 of file gnc-import-tx.cpp.

{

    /* Get the raw data first and handle an error if one occurs. */
    try
    {
        m_tokenizer->load_file (filename);
        return;
    }
    catch (std::ifstream::failure& ios_err)
    {
        // Just log the error and pass it on the call stack for proper handling
        PWARN ("Error: %s", ios_err.what());
        throw;
    }
}

◆ multi_split()

void GncTxImport::multi_split ( bool  multi_split)

Toggles the multi-split state of the importer and will subsequently sanitize the column_types list.

All types that don't make sense in the new state are reset to type GncTransPropType::NONE. Additionally the interpretation of the columns with transaction properties changes when changing multi-split mode. So this function will force a reparsing of the transaction properties (if there are any) by resetting the first column with a transaction property it encounters.
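The sanitizing step can be pictured with a small model. The sketch below is an assumption-laden illustration, not the GnuCash logic: `ColType`, `sanitize` and `set_multi_split` are hypothetical names, and the rule that a transfer-account column is only valid in two-split mode is invented for the example. Which types are actually reset in each mode is decided by sanitize_trans_prop in the real code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical column types for illustration; not the GncTransPropType enum.
enum class ColType { NONE, DATE, DESCRIPTION, TRANSFER_ACCOUNT, AMOUNT };

// Assumption made for this sketch only: a transfer-account column is
// meaningful in two-split mode and is reset to NONE in multi-split mode.
ColType sanitize (ColType type, bool multi_split)
{
    if (multi_split && type == ColType::TRANSFER_ACCOUNT)
        return ColType::NONE;
    return type;
}

// Applies the sanitizing rule to every column after a mode change.
void set_multi_split (std::vector<ColType>& column_types, bool multi_split)
{
    for (auto& type : column_types)
        type = sanitize (type, multi_split);
}
```

Columns whose type is valid in both modes are left alone by this model. The real function additionally forces one transaction-property column to be reparsed, which the sketch does not attempt to show.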

Parameters
multi_split  Boolean value with desired state (multi-split vs two-split).

Definition at line 145 of file gnc-import-tx.cpp.

{
    auto trans_prop_seen = false;
    m_settings.m_multi_split = multi_split;
    for (uint32_t i = 0; i < m_settings.m_column_types.size(); i++)
    {
        auto old_prop = m_settings.m_column_types[i];
        auto is_trans_prop = ((old_prop > GncTransPropType::NONE)
                && (old_prop <= GncTransPropType::TRANS_PROPS));
        auto san_prop = sanitize_trans_prop (old_prop, m_settings.m_multi_split);
        if (san_prop != old_prop)
            set_column_type (i, san_prop);
        else if (is_trans_prop && !trans_prop_seen)
            set_column_type (i, old_prop, true);
        trans_prop_seen |= is_trans_prop;

    }
    if (m_settings.m_multi_split)
        m_settings.m_base_account = nullptr;
}

◆ tokenize()

void GncTxImport::tokenize ( bool  guessColTypes)

Splits a file into cells.

This requires having an encoding that works (see GncTxImport::convert_encoding). Tokenizing-related options should be set to the user's selections before calling this function.

Notes:

  • this function must be called with guessColTypes set to true once before calling it with guessColTypes set to false.
  • if guessColTypes is true, all the column types will be set to GncTransPropType::NONE for now, as real guessing isn't implemented yet.

Parameters
guessColTypes  true to guess what the types of columns are based on the cell contents
Exceptions
std::range_error  if tokenizing failed
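How tokenized rows determine the number of column types can be sketched as follows. This is a simplified model: `Row`, `ColType`, `Parsed` and `collect` are hypothetical names used only here, and the real function also builds per-line transaction and split property objects, which are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only.
using Row = std::vector<std::string>;
enum class ColType { NONE, DATE, AMOUNT };

struct Parsed
{
    std::vector<Row> lines;
    std::vector<ColType> column_types;
};

// Keeps the non-empty rows and sizes column_types to the widest row.
// Types that were already set are preserved; new columns start as NONE.
void collect (const std::vector<Row>& tokens, Parsed& out)
{
    std::size_t max_cols = 0;
    out.lines.clear();
    for (const auto& row : tokens)
    {
        if (!row.empty())
            out.lines.push_back (row);
        max_cols = std::max (max_cols, row.size());
    }

    // No usable rows at all is treated as a parse failure.
    if (out.lines.empty())
        throw std::range_error ("There was an error parsing the file.");

    out.column_types.resize (max_cols, ColType::NONE);
}
```

Because the type list is resized rather than rebuilt, a column type chosen before a re-tokenize survives as long as its column still exists. This matches the listing's use of resize followed by a forced reinterpretation of the already set columns.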

Definition at line 405 of file gnc-import-tx.cpp.

{
    if (!m_tokenizer)
        return;

    uint32_t max_cols = 0;
    m_tokenizer->tokenize();
    m_parsed_lines.clear();
    for (auto tokenized_line : m_tokenizer->get_tokens())
    {
        auto length = tokenized_line.size();
        if (length > 0)
        {
            auto pretrans = std::make_shared<GncPreTrans>(date_format(), m_settings.m_multi_split);
            auto presplit = std::make_shared<GncPreSplit>(date_format(), currency_format());
            presplit->set_pre_trans (std::move (pretrans));
            m_parsed_lines.push_back (std::make_tuple (tokenized_line, ErrMap(),
                    presplit->get_pre_trans(), std::move (presplit), false));
        }
        if (length > max_cols)
            max_cols = length;
    }

    /* If it failed, generate an error. */
    if (m_parsed_lines.size() == 0)
    {
        throw (std::range_error (N_("There was an error parsing the file.")));
        return;
    }

    m_settings.m_column_types.resize(max_cols, GncTransPropType::NONE);

    /* Force reinterpretation of already set columns and/or base_account */
    for (uint32_t i = 0; i < m_settings.m_column_types.size(); i++)
        set_column_type (i, m_settings.m_column_types[i], true);
    if (m_settings.m_base_account)
    {
        for (auto line : m_parsed_lines)
            std::get<PL_PRESPLIT>(line)->set_account (m_settings.m_base_account);
    }

    if (guessColTypes)
    {
        /* Guess column_types based
         * on the contents of each column. */
        /* TODO Make it actually guess. */
    }
}

Field Documentation

◆ m_parsed_lines

std::vector<parse_line_t> GncTxImport::m_parsed_lines

source file parsed into a two-dimensional array of strings.

Per line also holds possible error messages and objects with extracted transaction and split properties.

Definition at line 163 of file gnc-import-tx.hpp.


The documentation for this class was generated from the following files: