Erfahrungen mit Victoria Metrics und Datenmigration aus InfluxDB v2

Schtallone · 19. Juni 2024 um 14:32

Hallo zusammen,

ich bin im Zuge vom Umstieg vom ioBroker zu HA auch von influxDB v2 auf Victoria Metrics umgestiegen.
Nun möchte ich gerne meine alten Daten aus influxDB v2 in VM migrieren. Jetzt ist meine allgemeine Frage, ob sich in diesem Forum jemand auskennt mit VM und einer Datenmigration?

Danke und Gruß

dp20eic · 19. Juni 2024 um 15:09

Moin,

Ja, das Internet, denn auf den Seiten von Virtoria Metrics gibt es auch

Du musst aber selber schauen, ob Du damit zurechtkommst, habe kein Victoria Metrics und will mir das auch nicht installieren.

How to migrate current data from InfluxDB to VictoriaMetrics[

Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of supported formats.

But migration from InfluxDB might get easier when using vmctl - VictoriaMetrics command-line tool. See more about migrating from InfluxDB v1.x versions. Migrating data from InfluxDB v2.x is not supported yet. But there is useful 3rd party solution for this.

Please note, that data migration is a backfilling process. So, please consider backfilling tips.

VG
Bernd

Schtallone · 19. Juni 2024 um 15:38

Ich kenne die Dokumentation im Netz, jedoch komme ich aus Mangel an Kenntnis nicht damit zurecht. Und andere Tutorials habe ich nicht gefunden.

dp20eic · 19. Juni 2024 um 15:47

Moin,

Na, steht ja mehr oder minder da, dass es aktuell keinen Weg von influxDB V.2 zu Victoria Metrics gibt, und man sich das 3rd Party Produkt anschauen soll.

Migrating data from InfluxDB (2.x)

Migrating data from InfluxDB v2.x is not supported yet (#32). You may find useful a 3rd party solution for this - GitHub - jonppe/influx_to_victoriametrics: Python InfluxDB to VictoriaMetrics exporter script.

./influx_export.py -h
usage: influx_export.py [-h] [--INFLUXDB_V2_ORG INFLUXDB_V2_ORG] [--INFLUXDB_V2_URL INFLUXDB_V2_URL] [--INFLUXDB_V2_TOKEN INFLUXDB_V2_TOKEN]
                        [--INFLUXDB_V2_SSL_CA_CERT INFLUXDB_V2_SSL_CA_CERT] [--INFLUXDB_V2_TIMEOUT INFLUXDB_V2_TIMEOUT] [--INFLUXDB_V2_VERIFY_SSL INFLUXDB_V2_VERIFY_SSL]
                        [--vm-addr VM_ADDR]
                        bucket

Script for exporting InfluxDB data into victoria metrics instance. InfluxDB settings can be defined on command line
or as environment variables (or in .env file  if python-dotenv is installed). 

InfluxDB related args described in https://github.com/influxdata/influxdb-client-python#via-environment-properties

positional arguments:
  bucket                InfluxDB source bucket

optional arguments:
  -h, --help            show this help message and exit
  --INFLUXDB_V2_ORG INFLUXDB_V2_ORG, -o INFLUXDB_V2_ORG
                        InfluxDB organization
  --INFLUXDB_V2_URL INFLUXDB_V2_URL, -u INFLUXDB_V2_URL
                        InfluxDB Server URL, e.g., http://localhost:8086
  --INFLUXDB_V2_TOKEN INFLUXDB_V2_TOKEN, -t INFLUXDB_V2_TOKEN
                        InfluxDB access token.
  --INFLUXDB_V2_SSL_CA_CERT INFLUXDB_V2_SSL_CA_CERT, -S INFLUXDB_V2_SSL_CA_CERT
                        Server SSL Cert
  --INFLUXDB_V2_TIMEOUT INFLUXDB_V2_TIMEOUT, -T INFLUXDB_V2_TIMEOUT
                        InfluxDB timeout
  --INFLUXDB_V2_VERIFY_SSL INFLUXDB_V2_VERIFY_SSL, -V INFLUXDB_V2_VERIFY_SSL
                        Verify SSL CERT.
  --vm-addr VM_ADDR, -a VM_ADDR
                        VictoriaMetrics server

Das Skript herunterladen und ausführen, mit den Parametern die da gebraucht werden.

VG
Bernd

Schtallone · 19. Juni 2024 um 16:55

Wie gesagt, ich kenne das Script, aber ich habe halt kein wirkliches Wissen über python Script. Sodass es eben nicht so einfach runterladen und ausführen ist.
Es fängt bei mir schon damit an, dass ich den Syntax mit den zusätzlichen Angaben nicht richtig hin bekomme.
Zudem bekomme ich das Script selbst nicht hausgeführt zb mit dem Argument -h als Dienst innerhalb HA.

dp20eic · 19. Juni 2024 um 17:33

Moin,

ich möchte Dir nicht auf die Füße treten, aber ich vermute Dir fehlen die Skills, um das zu lösen.

Python ist eine Skriptsprache, normalerweise wird sie so ausgeführt

# python /pfad/zur/datei/influx_export.py -h

Ich hab mir das gerade mal angeschaut, da gehört doch schon mehr dazu, selbst wenn Du das Skript heruntergeladen hast und Du Dir ein temporäres Python Environment erstellen kannst, in dem Du dann noch zusätzliche Python Libs, nachinstallieren musst, ist das eher was für erfahrene User, zudem ist das Skript 4 Jahre nicht mehr angefasst worden, ob das dann noch mehr Probleme mit den aktuellen Python Versionen hat, kann ich nicht sagen.

Also, Deine Idee, über CSV es zu versuchen, ist, glaube ich, da doch erst einmal der richtige Weg.

Sorry für die Verwirrung mit all den Posts.

Darf ich fragen, wieso Du zu Victoria Metrics gewechselt bist?

VG
Bernd

Schtallone · 19. Juni 2024 um 18:03

Ich bin daher gewechselt, da Flux als Sprache nicht mehr weiter entwickelt werden soll und dann war ich nach einer zukunftsorientierten Alternative zu InfluxDB auf der Suche. Die Sprache MetricsQL bzw.PromQL ist deutlich einfacher als zb Flux.
Keine gute Idee?

Edit:
Dies ist übrigens der Code des Scripts, wie ich es versucht habe auszuführen:

influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t xxxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M

bzw. auch mit Pfad zum Script im HA:

pyscript/influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t xxxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M

und hier aus der configuration.yaml der code für pyscript:

pyscript:
  allow_all_imports: true
  hass_is_global: true
  requirements:
  - pandas~=2.2.2
  - requests~=2.31.0
  - python-dotenv~=1.0.1
  - humanize~=4.9.0
  - urllib3~=1.26.18
  - influxdb-client>1.40.0

dp20eic · 20. Juni 2024 um 17:00

Moin,

Ah, aber das ist nur die halbe Wahrheit, siehe hier,

Es wird noch gepflegt, und influxQL und die neue Sprache sql, die sich näher am SQL Standard anlehnt, das kommt aber erst für influxDB V3., die es aktuell nur in der Cloud gibt und ob es eine Open Source geben wird muss sich erst noch zeigen.

War das eine Frage an mich?
Ich habe mich von influxQL zu FLUX durchgearbeitet, ich komme damit eigentlich ganz gut zurecht, ich klicke mir die Abfragen in der influxDB V. 2 WEB-GUI zusammen, um sie zu kopieren und in Grafana dann anzuwenden.

Und welche Ausgaben komme, ich kann das ja nicht nachvollziehen.

Das Skript, welches Du nennst influxv2tovm.py kenne ich nicht, wo ist das her?
Hast Du das So genannt?
Wenn ja, dann entweder das Skript noch ausführbar machen oder mit Python vorweg aufrufen.

# python3 influxv2tovm.py ...

Dann sind das Key = Value, die mit Sicherheit bei Dir anders heißen als das, was Du aufrufst.

-o = ist die Organisation, die Du eingerichtet hast, heißt die wirklich influxdb2_org wie in der Dokumentation

Sind denn auch alle Requirments, erfüllt?

VG
Bernd

Schtallone · 20. Juni 2024 um 17:13

Hallo,

das Script habe ich hier her und es ist eine Abwandlung des originalen Scripts, welches u.A. auch “Dry-Runs” unterstützt.

https://github.com/frli4797/influxv2tovm

Ich versuche das Script in den Diensten im HA so auszuführen:

service: pyscript.influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t  xxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M
data: {}

Das ist dann die Fehlermeldung:

Fehler beim Aufrufen des Diensts pyscript.influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t K3s0zDvj1l5y_hVysvyNRocjZay9bGCHxkfaOJinPLnEisiIj4U1sjhN4p4NDFYupQ2A4XZByvAIBSY4_a7Fcg== -a http://homeassistant.local:8428 -n VM_meterReadings_1M. Service pyscript.influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t xxxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M does not match format <domain>.<name> for dictionary value @ data['sequence'][0]['service']. Got 'pyscript.influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t K3s0zDvj1l5y_hVysvyNRocjZay9bGCHxkfaOJinPLnEisiIj4U1sjhN4p4NDFYupQ2A4XZByvAIBSY4_a7Fcg== -a http://homeassistant.local:8428 -n VM_meterReadings_1M'

Die Attribute passen alle, meine Organisation heißt tatsächlich so im Default…

Was meinst Du genau mit “Script” noch ausführbar machen? Sorry für die Frage…

dp20eic · 20. Juni 2024 um 18:33

Moin,

Wieso aus HA heraus starten, kann man machen, ist doch aber nur einmalig zu machen, bis alle Daten von influxDB hinzu Victoria Metrics migriert sind.

wenn Du Dir das Skript im Ordner anschaust, dann hat es oft nur Lese und Schreibrechte für User, Gruppe und Other hat, Beispiel

-rw-r--r-- 1 dp20eic dp20eic    4685 19. Jun 19:11  influx_export.py

Um dann das Skript auszuführen, muss man das mit dem ausführenden Programm, hier Python tun, Beispiel

# python3 influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t  xxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M
data: {}

Wenn das Skript ausführbar ist, kann man das Python vorne weglassen, Ausführbar sieht dann so aus

-rwxr-xr-x 1 dp20eic dp20eic    4685 19. Jun 19:11  influx_export.py

Ich denke aber, dass Du da nichts ändern musst, denn wenn es innerhalb von HA ausgeführt wird, steht python automatisch davor, man sieht es nicht, bzw. das pyskript ist ein Environment, welches die Skripte ausführbar macht.

Ich muss mal schauen, ob ich mir nicht mal eine Victoria Metrics aufbaue und das mal selbst teste, wenn ich Zeit habe, ist halt gerade nur schlecht wegen EM und Stiefsohn ist aus Thailand hier, daher nicht so viel Zeit.

VG
Bernd

Schtallone · 21. Juni 2024 um 10:36

Kein Streß! EM und Familie geht natürlich vor. Ich schaue mir das morgen auch nochmal an. Gruß

Hallo,
wir kommen der Sache näher. Ich habe das Script nun per Konsole im HA mit dem Befehl python 3 ausgeführt und jetzt wird es auch ausgeführt. Es gibt jedoch noch eine Fehlermeldung, welche ich nicht recht verstehe…

hier das Log soweit. Ein erwartetes Datenformat bei dem Sammeln der Measurments scheint nicht zu passen

Dry run True Pivot False
Finding unique time series.
Traceback (most recent call last):
  File "/homeassistant/pyscript/influxv2tovm.py", line 343, in <module>
    main(vars(parser.parse_args()))
  File "/homeassistant/pyscript/influxv2tovm.py", line 268, in main
    migrator.migrate()
  File "/homeassistant/pyscript/influxv2tovm.py", line 95, in migrate
    measurements_and_fields = self.__find_all_measurements()
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/homeassistant/pyscript/influxv2tovm.py", line 188, in __find_all_measurements
    measurements_and_fields.update(df[self.__measurement_key].unique())
                                   ~~^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: string indices must be integers, not 'str'
Exception ignored in: <function InfluxMigrator.__del__ at 0x7fbcd6f58540>
Traceback (most recent call last):
  File "/homeassistant/pyscript/influxv2tovm.py", line 78, in __del__
    self.__progress_file.close()
    ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'InfluxMigrator' object has no attribute '_InfluxMigrator__progress_file'

Hier das Script, so wie ich es aktuell ausführe (ich habe keine Änderungen gemacht):

#!/usr/bin/env python3
"""
 @author Fredrik Lilja

 SPDX-License-Identifier: Apache-2.0
"""
import datetime
import logging
import os
import warnings
from typing import Iterable, Dict, List

import humanize
import pandas as pd
import requests
from influxdb_client import InfluxDBClient, QueryApi
from influxdb_client.client.warnings import MissingPivotFunction

warnings.simplefilter("ignore", MissingPivotFunction)

# Create a custom logger
logger = logging.getLogger(__name__)
# noinspection SpellCheckingInspection
logging.basicConfig(filename="migrator.log", encoding="utf-8", level=logging.DEBUG,
                    format='%(asctime)s - %(name)s - %(levelname)s -  %(message)s')

try:
    # noinspection PyUnresolvedReferences
    import dotenv

    dotenv.load_dotenv(dotenv_path=".env")
except ImportError as err:
    pass


class Stats:
    bytes: int = 0
    lines: int = 0

    def humanized_bytes(self) -> str:
        """
        Get the number of bytes as natural size.
        :return: str
        """
        return humanize.naturalsize(self.bytes)

    def increment(self, lines: str):
        """
        Increments the number of bytes and the number of lines from a string.
        :param lines: lines string
        """
        no_lines = lines.count('\n')
        self.lines = self.lines + no_lines

        new_bytes = len(lines.encode("utf8"))
        self.bytes += new_bytes


class InfluxMigrator:
    __query_api: QueryApi
    __measurement_key = "_measurement"
    __client: InfluxDBClient

    # noinspection SpellCheckingInspection
    def __init__(self, bucket: str, vm_url: str, chunksize: int = 100, dry_run: bool = False, pivot: bool = False):
        self.bucket = bucket
        self.vm_url: str = vm_url
        self.chunksize = chunksize
        # now_datetime_str = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        # self.__progress_file = open(f".migrator_{now_datetime_str}", 'w')
        self.stats = Stats()
        self.dry_run = dry_run
        self.pivot = pivot
        if pivot:
            self.__measurement_key = "entity_id"

    def __del__(self):
        self.__progress_file.close()
        self.__client.close()

    def influx_connect(self):
        """
        Connects to the influx database.
        """
        self.__client = InfluxDBClient.from_env_properties()
        self.__query_api = self.__client.query_api()

    def migrate(self):

        if self.__query_api is None:
            raise AssertionError("No connection to InfluxDb started.")

        # Get all unique series by reading first entry of every table.
        # With latest InfluxDB we could possibly use "schema.measurements()" but this doesn't exist in 2.0
        measurements_and_fields = self.__find_all_measurements()

        field_no = 1
        for meas in measurements_and_fields:
            no_lines = 0

            chunk_query = f"""
                        from(bucket: "{self.bucket}")
                        |> range(start: -100d, stop: now())
                        |> filter(fn: (r) => r["{self.__measurement_key}"] == "{meas}")
                        |> limit(n: {self.chunksize}, offset: _offset)
                        """
            df_empty = False
            offset = 0
            while not df_empty:
                params = {"_offset": offset}
                result = self.__query_api.query_data_frame(chunk_query, params=params)
                if type(result) is not list:
                    result: list = [result]
                else:
                    print("It's a list")

                for df in result:
                    df_empty = df.empty
                    if df_empty:
                        break

                    # Increase offset with the number of rows in the DataFrame.
                    offset += df.shape[0]

                    assert (type(df) is pd.DataFrame)
                    lines_protocol_str = self.__get_influxdb_lines(df)
                    self.stats.increment(lines_protocol_str)
                    no_lines += lines_protocol_str.count('\n') + 1

                    if not self.dry_run:
                        requests.post(f"{self.vm_url}/write?db={self.bucket}", data=lines_protocol_str)
                    else:
                        print(lines_protocol_str)

                    print(
                        f"Wrote {no_lines} lines "
                        f"bytes to VictoriaMetrics db={self.bucket} for {meas}. "
                        f"Total: {self.stats.humanized_bytes()} "
                        f"({field_no}/{len(measurements_and_fields)})",
                        end='\r')
            field_no += 1

    @staticmethod
    def __whitelist_measurements(measurements_and_fields: List) -> List[tuple]:
        """
        Applies a whitelist to the list of measurements and fields. Does nothing if no whitelist is found.

        :param measurements_and_fields :
        :return:  the new measurements and fields tuple list with the whitelist applied.
        """
        whitelist: List[tuple] = []
        whitelist_path = "whitelist.txt"
        if os.path.exists(whitelist_path):
            try:
                with open(whitelist_path, 'r') as f:
                    whitelist_rows = f.read().splitlines()

                    for row_str in whitelist_rows:
                        row = row_str.split(' ')
                        if len(row) > 3:
                            tup: tuple = row[1], row[2]
                            whitelist.append(tup)
            except OSError:
                print("Problem reading whitelist. Skipping")

            if len(whitelist) > 0:
                m_a_f_set = set(measurements_and_fields)
                whitelist_set = set(whitelist)
                measurements_and_fields = list(set.intersection(m_a_f_set, whitelist_set))

        return measurements_and_fields

    def __find_all_measurements(self):
        """
        Finds all permutations of measurements and fields.
        :return: a list of tuples
        """

        print("Finding unique time series.")
        first_in_series = f"""
           from(bucket: "{self.bucket}")
           |> range(start: 0, stop: now())
           |> first()"""
        timeseries: List[pd.DataFrame] = self.__query_api.query_data_frame(first_in_series)

        measurements_and_fields = set()
        for df in timeseries:
            measurements_and_fields.update(df[self.__measurement_key].unique())

        print(f"Found {len(measurements_and_fields)} unique time series")
        return measurements_and_fields

    @staticmethod
    def __get_tag_cols(dataframe_keys: Iterable) -> Iterable:
        """
        Filter out dataframe keys that are not tags

        @param dataframe_keys:
        @return:
        """
        return (
            k
            for k in dataframe_keys
            if not k.startswith("_") and k not in ["result", "table"]
        )

    def __get_influxdb_lines(self, df: pd.DataFrame) -> str:
        """
        Convert the Pandas Dataframe into InfluxDB line protocol.

        The dataframe should be similar to results received from query_api.query_data_frame()

        Not quite sure if this supports all kinds if InfluxDB schemas.
        It might be that influxdb_client package could be used as an alternative to this,
        but I'm not sure about the authorizations and such.

        Protocol description: https://docs.influxdata.com/influxdb/v2.0/reference/syntax/line-protocol/
        """
        logger.info(f"Exporting {df.columns}")

        if df.empty:
            logger.debug(f"No data points for this")
            return ""

        line: str
        # Only applies to Homeassistant data migration.
        # self.__pivot guides if this is straight conversion/export or pivoting the measurements into
        # unit and having the entity ids as measurements.
        if self.pivot:
            line = df["entity_id"]
            line = df["domain"] + "." + line
        else:
            line = df["_measurement"]

        for col_name in self.__get_tag_cols(df):
            line += ("," + col_name + "=") + df[col_name].astype(str)

        if self.pivot:
            line += ("," + "unit_of_measurement=") + df["_measurement"].astype(str)

        line += (
                " "
                + df["_field"]
                + "="
                + df["_value"].astype(str)
                + " "
                + df["_time"].astype(int).astype(str)
        )
        return "\n".join(line)


def main(args: Dict[str, str]):
    logger.info("args: " + str(args.keys()))
    bucket = args.pop("bucket")
    vm_url = args.pop("vm_addr")
    dry_run = bool(args.pop("dry_run"))
    pivot = bool(args.pop("pivot"))

    print(f"Dry run {dry_run} Pivot {pivot}")

    for k, v in args.items():
        if v is not None:
            os.environ[k] = v
        logger.info(f"Using {k}={os.getenv(k)}")

    migrator = InfluxMigrator(bucket, vm_url, chunksize=5000, dry_run=dry_run, pivot=pivot)
    migrator.influx_connect()
    migrator.migrate()


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(
        description="Script for exporting InfluxDB data into victoria metrics instance. \n"
                    " InfluxDB settings can be defined on command line or as environment variables"
                    " (or in .env file if python-dotenv is installed)."
                    " InfluxDB related args described in \n"
                    "https://github.com/influxdata/influxdb-client-python#via-environment-properties"
    )
    parser.add_argument(
        "bucket",
        type=str,
        help="InfluxDB source bucket",
    )
    parser.add_argument(
        "--INFLUXDB_V2_ORG",
        "-o",
        type=str,
        help="InfluxDB organization",
    )
    parser.add_argument(
        "--INFLUXDB_V2_URL",
        "-u",
        type=str,
        help="InfluxDB Server URL, e.g., http://localhost:8086",
    )
    parser.add_argument(
        "--INFLUXDB_V2_TOKEN",
        "-t",
        type=str,
        help="InfluxDB access token.",
    )
    parser.add_argument(
        "--INFLUXDB_V2_SSL_CA_CERT",
        "-S",
        type=str,
        help="Server SSL Cert",
    )
    parser.add_argument(
        "--INFLUXDB_V2_TIMEOUT",
        "-T",
        type=str,
        help="InfluxDB timeout",
    )
    parser.add_argument(
        "--INFLUXDB_V2_VERIFY_SSL",
        "-V",
        type=str,
        help="Verify SSL CERT.",
    )
    parser.add_argument(
        "--vm-addr",
        "-a",
        type=str,
        help="VictoriaMetrics server",
    )
    parser.add_argument(
        "--dry-run",
        "-n",
        action='store_true',
        default=False,
        help="Dry run",
    )
    parser.add_argument(
        "--pivot",
        "-P",
        action='store_true',
        default=False,
        help="Pivot entity_id to be measurement",
    )

    main(vars(parser.parse_args()))
    print("All done")

Hier ein Auszug aus dem Bucket, welchen ich migrieren möchte:

#group
false
false
true
true
false
false
true
true
#datatype
string
long
dateTime:RFC3339
dateTime:RFC3339
dateTime:RFC3339
double
string
string
#default
_result
result
table
_start
_stop
_time
_value
_field
_measurement
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-20T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-21T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-22T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-23T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-24T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-25T22:00:00Z
2.8013339999999998
value
Backofen
0
2022-10-31T23:00:00Z
2024-06-21T10:24:19.549112938Z
2023-06-26T22:00:00Z
2.8013339999999998
value
Backofen

Gruß

dp20eic · 21. Juni 2024 um 11:04

Moin,

Das was mich die ganze Zeit schon irritiert ist, dass Du nirgends eine Datenbank (Bucket) angibst, was ja nicht sein kann, denn in der Beschreibung steht ja

./influx_export.py -h
usage: influxv2tovm.py [-h] 
 [--INFLUXDB_V2_ORG INFLUXDB_V2_ORG] 
 [--INFLUXDB_V2_URL INFLUXDB_V2_URL] 
 [--INFLUXDB_V2_TOKEN INFLUXDB_V2_TOKEN]
 [--INFLUXDB_V2_SSL_CA_CERT INFLUXDB_V2_SSL_CA_CERT] 
 [--INFLUXDB_V2_TIMEOUT INFLUXDB_V2_TIMEOUT] 
 [--INFLUXDB_V2_VERIFY_SSL INFLUXDB_V2_VERIFY_SSL]
 [--vm-addr VM_ADDR]
 
bucket

Wobei

bucket = dem Bucket in der influxDB entspricht, das Du migrierenwillst

Dieser Parameter muss als letzes stehen, da er Positionsabhängig ist.

VG
Bernd

Schtallone · 21. Juni 2024 um 12:38

Hi,

den Bucket habe ich angeben:

# python3 influxv2tovm.py -o influxdb2_org -u http://192.168.1.5:8086 -t  xxmytokenxxx -a http://homeassistant.local:8428 -n VM_meterReadings_1M
data: {}

Das ist der Bucket

VM_meterReadings_1M

das Attribut “-n” ist für den “Dry-Run”, also ohne das erstmal Daten geschrieben werden

optional arguments:
  -h, --help            show this help message and exit
  --INFLUXDB_V2_ORG INFLUXDB_V2_ORG, -o INFLUXDB_V2_ORG
                        InfluxDB organization
  --INFLUXDB_V2_URL INFLUXDB_V2_URL, -u INFLUXDB_V2_URL
                        InfluxDB Server URL, e.g., http://localhost:8086
  --INFLUXDB_V2_TOKEN INFLUXDB_V2_TOKEN, -t INFLUXDB_V2_TOKEN
                        InfluxDB access token.
  --INFLUXDB_V2_SSL_CA_CERT INFLUXDB_V2_SSL_CA_CERT, -S INFLUXDB_V2_SSL_CA_CERT
                        Server SSL Cert
  --INFLUXDB_V2_TIMEOUT INFLUXDB_V2_TIMEOUT, -T INFLUXDB_V2_TIMEOUT
                        InfluxDB timeout
  --INFLUXDB_V2_VERIFY_SSL INFLUXDB_V2_VERIFY_SSL, -V INFLUXDB_V2_VERIFY_SSL
                        Verify SSL CERT.
  --vm-addr VM_ADDR, -a VM_ADDR
                        VictoriaMetrics server
  --dry-run, -n         Dry run, don't write changes to VM
  --pivot, -P           Pivot entity_id to be measurement, this is specific to Homeassistant metrics

dp20eic · 21. Juni 2024 um 13:36

Moin,

Das ist wirklich der Name des Bucket in influxDB?
Du siehst in der influxDB UI wie das Bucket heißt

VG
Bernd

Schtallone · 21. Juni 2024 um 16:41

aber ja doch

dp20eic · 21. Juni 2024 um 20:21

Moin,

Ok, dann bin ich erst einmal raus, bis ich vielleicht Zeit gefunden habe, das mal hier selbst nachzustellen.

VG
Bernd

dp20eic · 23. Juni 2024 um 11:22

Moin,

Mal ein Selbstgespräch,

also ich habe mir Victoria Metrics, als LXC unter Proxmox installiert.

dann habe ich auf dem Victoria Metrics Server ein virtuelles Python Environment angelegt und aktiviert

# mkdir -p /tmp/influx2vm
# python3 -m venv /tmp/influx2vm
# cd /tmp/influx2vm
# source bin/activate
#

dann habe ich das Skript von Github in das Verzeichnis heruntergeladen und ausführbar gemacht
```
# chmod 755 linfluxv2tovm.py
```
beim ersten Aufruf wurden noch einige fehlende Python Pakete angemeckert, die ich dann nachinstalliert habe
```
# bin/pip install ...
```

dann habe ich das Skript einmal so aufgerufen, das das bei mir zu Formatiert ist, ist nur der Übersicht geschuldet, kann man so machen, alles in eine Zeile, ohne \, geht auch.

# ./influxv2tovm.py -o fritzinfluxdb \
 -u influxdb.fritz.box:8086 \
 -t VWyzh43W33FsLI34ms33MPdBWjN35IRvQCMyTmG36aQ-E37tmViJ38nRL33J33Bm39PozdbF63YCHxrRqSe3l_eW_33tDj33VAw== \
 -a http://localhost:8428 \
 -n \ 
fritzinfluxdb

Das ist der dry run der lief dann in ein Time out.

❯ ./influxv2tovm.py -o fritzinfluxdb -u influxdb.fritz.box:8086 -t VXyzh9X7FsLI18ms1MPdBXjN0IRvQCMyTmG5aQ-E9tmViJ3nRL8J1Bm9PozdbF4YCHxrRqSe0l_eX_7tDj7VAw== -a http://localhost:8428 -n fritzinfluxdb
Dry run True Pivot False
Finding unique time series.
Found 1 unique time series
zsh: killed     ./influxv2tovm.py -o fritzinfluxdb -u influxdb.fritz.box:8086 -t  -a  -n

dann habe ich mir gesagt, was solls, wer nicht wagt, der nicht gewinnt und noch mal ohne -n aufgerufen,

❯ ./influxv2tovm.py -o fritzinfluxdb -u influxdb.fritz.box:8086 -t VXyzh9X7FsLI18ms1MPdBXjN0IRvQCMyTmG5aQ-E9tmViJ3nRL8J1Bm9PozdbF4YCHxrRqSe0l_eX_7tDj7VAw== -a http://localhost:8428 fritzinfluxdb
Dry run False Pivot False
Finding unique time series.
Found 1 unique time series

Aktuell steht er da, mal sehen, ob es auch wieder ein Timeout gibt, dann muss ich schauen wie man größere Buckets in kleine Scheiben schneidet, um die zu übertragen.

VG
Bernd

P.S.: Ist durchgelaufen, muss mich mal damit beschäftigen, wie ich da Daten abfragen muss.
P.P.S.: ist doch nicht durchgelaufen, ging irgendwann in einen Timeout und der Prozess wurde gekillt.

Ich mache dann hier erst einmal nicht weiter, da ich keine Zeit habe, mir da anzuschauen, wieso das Krachen geht.