The dataset comprised 272 cases, amounting to more than 351,000 rows of data, and was collected from both electronic medical records and the VIPER data management system, in which all perfusion-related data were recorded. Demographic data were expressed as mean values with standard deviation (SD) for continuous variables or as a percentage of the total for categorical variables (Table 1). We compared patient baseline characteristics with operative details obtained from electronic medical records, the results of in-line sensor monitoring and pump flow rate measurements. The sheer volume of the data and the diversity of its patterns called for the use of Data Science tools [8]. In Data Science nomenclature, our data comprise structured data (e.g., laboratory results), semi-structured data (e.g., sensor data) and unstructured data (e.g., patient notes) [9-11]. Additionally, using these tools, we looked for adequate pump flow rates at which relevant GDP conditions, such as DO₂ > 280 ml/m²/min, SvO₂ > 68% and MAP > 60 mmHg, were met.
Our toolset was based on the Anaconda Distribution (Anaconda Inc., https://www.anaconda.com), Python 3.6 (Python Software Foundation, https://www.python.org) and Jupyter Notebook (Jupyter Project, https://jupyter.org). Data cleaning and analysis were performed using Pandas (Python Data Analysis Library), whereas visualization was done with the Matplotlib (Matplotlib Development Team, https://matplotlib.org) and Seaborn (https://seaborn.pydata.org) libraries. All of these applications are released under Berkeley Software Distribution (BSD)-type licenses, which means that they are free to distribute, modify and use privately or commercially, without any obligations in return.
All the data, both structured and unstructured, were first gathered in Microsoft Excel format (Microsoft Office package, https://products.office.com/pl-pl/excel), with a separate spreadsheet for each patient. From there, the data were imported into a Jupyter Notebook, where they were merged, using the Pandas module, into one large dataset with time as its main index. The dataset was then stripped of duplicates and empty rows and cleaned of unnecessary information, and missing values were interpolated from nearby points. These operations were necessary to create a data platform that was subsequently investigated by applying different sets of constraints. Finally, visual analysis was carried out using 2D and 3D plots.
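The cleaning steps and the GDP constraints described above can be illustrated with a minimal Pandas sketch. It is not the authors' actual notebook: the column names (`time`, `flow`, `DO2`, `SvO2`, `MAP`), the helper functions and the toy values are all hypothetical, and only the three thresholds are taken from the text. As a design choice of this sketch, each patient's table is cleaned before merging, so that interpolation never bridges readings from two different patients.

```python
import numpy as np
import pandas as pd

# GDP thresholds quoted in the text; the column names are hypothetical.
GDP_LIMITS = {"DO2": 280.0, "SvO2": 68.0, "MAP": 60.0}


def clean_patient(frame):
    """Clean one patient's table and index it by time."""
    signals = [c for c in frame.columns if c != "time"]
    frame = (
        frame.drop_duplicates()                  # remove repeated rows
        .dropna(how="all", subset=signals)       # remove rows with no readings
        .set_index("time")
        .sort_index()
    )
    # Fill remaining gaps from neighbouring points, weighted by elapsed time.
    frame[signals] = frame[signals].interpolate(method="time")
    return frame


def build_dataset(sheets):
    """Merge cleaned per-patient tables into one time-indexed dataset."""
    cleaned = [
        clean_patient(frame).assign(patient=name)
        for name, frame in sheets.items()
    ]
    return pd.concat(cleaned).sort_index()


def meets_gdp(data, limits=GDP_LIMITS):
    """Boolean mask of rows in which every GDP threshold is exceeded."""
    columns = list(limits)
    return data[columns].gt(pd.Series(limits), axis="columns").all(axis=1)


# Made-up stand-in for a workbook with one sheet per patient.
t_a = pd.date_range("2020-01-01 10:00", periods=4, freq="min")
t_b = pd.date_range("2020-01-02 09:00", periods=3, freq="min")
sheets = {
    "patient_A": pd.DataFrame({
        "time": [t_a[0], t_a[1], t_a[2], t_a[2], t_a[3]],  # one duplicate row
        "flow": [4.0, 4.2, 4.4, 4.4, np.nan],              # last row is empty
        "DO2": [260.0, np.nan, 290.0, 290.0, np.nan],      # one missing value
        "SvO2": [66.0, 70.0, 72.0, 72.0, np.nan],
        "MAP": [58.0, 62.0, 65.0, 65.0, np.nan],
    }),
    "patient_B": pd.DataFrame({
        "time": t_b,
        "flow": [4.8, 5.0, 4.1],
        "DO2": [300.0, 310.0, 250.0],
        "SvO2": [70.0, 67.0, 75.0],
        "MAP": [65.0, 70.0, 70.0],
    }),
}

data = build_dataset(sheets)
mask = meets_gdp(data)
adequate_flow = data.loc[mask, "flow"]  # pump flow where all conditions hold
print(adequate_flow)
```

With a real workbook, the `sheets` dictionary could presumably be obtained from `pd.read_excel(path, sheet_name=None)`, which returns one DataFrame per sheet; the file path and sheet layout would depend on how the data were exported.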
Table 1. Demographic characteristics of the study group
Age (years), mean ± SD | 62.5 ± 12.4 |
Gender male, n (%) | 203 (73.6) |
Body surface area (m²), mean ± SD | 1.95 ± 0.21 |
Body mass index (kg/m²), mean ± SD | 28.38 ± 4.78 |
Risk factors, n (%) |
Coronary artery disease | 190 (68.8) |
Hypertension | 188 (68.1) |
Diabetes mellitus | 79 (28.6) |
Hypercholesterolemia | 59 (21.4) |
COPD | 12 (4.3) |
ESRD | 3 (1.1) |
Type of surgery, n (%) |
CABG | 133 (48.9) |
Other single procedure | 95 (34.9) |
Double procedure | 43 (15.8) |
Triple procedure | 1 (0.4) |
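The summary format used in Table 1 (mean ± SD for continuous variables, n (%) for categorical ones) could be produced along the following lines. This is only a sketch: the records, column names and helper functions are invented for illustration, and the use of the sample standard deviation (Pandas' default, `ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import pandas as pd

# Made-up records for illustration only; not taken from the study.
patients = pd.DataFrame({
    "age": [50, 60, 70, 80],
    "sex": ["M", "M", "F", "M"],
})


def mean_sd(series, decimals=1):
    """Format a continuous variable as 'mean ± SD' (sample SD, ddof=1)."""
    return f"{series.mean():.{decimals}f} ± {series.std():.{decimals}f}"


def count_pct(flags, decimals=1):
    """Format a boolean variable as 'n (%)' of all rows."""
    n = int(flags.sum())
    return f"{n} ({100 * n / len(flags):.{decimals}f})"


age_summary = mean_sd(patients["age"])
male_summary = count_pct(patients["sex"] == "M")
print("Age (years), mean ± SD |", age_summary)
print("Gender male, n (%) |", male_summary)
```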