Interface PipelinesHistoryService

    • Method Detail

      • getPipelineProcess

        PipelineProcess getPipelineProcess(@NotNull UUID datasetKey,
                                           int attempt)
        Gets the PipelineProcess identified by the dataset and attempt identifiers.
        Parameters:
        datasetKey - dataset identifier
        attempt - crawl attempt identifier
        Returns:
        the PipelineProcess instance, if it exists.
      • createPipelineProcess

        long createPipelineProcess(@NotNull PipelineProcessParameters params)
        Creates and persists a pipeline process for a dataset and attempt identifier. If the process already exists, the existing one is returned.
        Parameters:
        params - pipeline process parameters, containing the dataset key and attempt
        Returns:
        the key of the PipelineProcess created.
      • addPipelineExecution

        long addPipelineExecution(long processKey,
                                  @NotNull PipelineExecution pipelineExecution)
        Adds and persists the information of a pipeline execution.
        Parameters:
        processKey - sequential identifier of a pipeline process
        pipelineExecution - pipeline execution data
        Returns:
        the key of the PipelineExecution created.
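
        The create-or-return-existing behaviour described for createPipelineProcess can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical: the class InMemoryProcessRegistry, its map-based storage, and the simplified (datasetKey, attempt) arguments stand in for the real service and for PipelineProcessParameters. It also assumes that the key of an existing process is what the caller gets back, and it says nothing about how the actual implementation persists data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in, NOT the real PipelinesHistoryService implementation.
class InMemoryProcessRegistry {

  // Maps "datasetKey#attempt" to the sequential process key assigned to it.
  private final Map<String, Long> processKeys = new HashMap<>();
  private long nextKey = 1;

  // Assumption: calling this twice for the same dataset and attempt yields the same key.
  long createPipelineProcess(UUID datasetKey, int attempt) {
    String id = datasetKey + "#" + attempt;
    Long existing = processKeys.get(id);
    if (existing != null) {
      return existing; // process already exists, reuse it
    }
    long key = nextKey++;
    processKeys.put(id, key);
    return key;
  }

  public static void main(String[] args) {
    InMemoryProcessRegistry registry = new InMemoryProcessRegistry();
    UUID datasetKey = UUID.randomUUID();

    long first = registry.createPipelineProcess(datasetKey, 1);
    long again = registry.createPipelineProcess(datasetKey, 1);
    long other = registry.createPipelineProcess(datasetKey, 2);

    System.out.println("same attempt reuses key: " + (first == again));
    System.out.println("new attempt gets new key: " + (first != other));
  }
}
```

        With a contract of this kind, a caller can request the process key before every addPipelineExecution call without first checking whether the process exists.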
      • getRunningExecutionKey

        Long getRunningExecutionKey(@NotNull UUID datasetKey)
        Gets the key of the execution currently running for a dataset.
        Parameters:
        datasetKey - dataset identifier
        Returns:
        the key of the running execution
      • updatePipelineStep

        long updatePipelineStep(@NotNull PipelineStep pipelineStep)
        Updates the information of a pipeline step.
        Parameters:
        pipelineStep - step to be updated
        Returns:
        the key of the PipelineStep updated.
      • markPipelineExecutionIfFinished

        void markPipelineExecutionIfFinished(long executionKey)
        Marks the pipeline execution as finished when all pipeline steps are finished.
        Parameters:
        executionKey - key of the pipeline execution
      • markPipelineStatusAsAborted

        void markPipelineStatusAsAborted(long executionKey)
        Changes the status to ABORTED and sets the finished date if the state is RUNNING, QUEUED or SUBMITTED, and marks the pipeline execution as finished.
        Parameters:
        executionKey - key of the pipeline execution
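
        The two status rules described for markPipelineExecutionIfFinished and markPipelineStatusAsAborted can be sketched as pure functions. This is an illustration under stated assumptions, not the service's code: the Status enum is hypothetical (only RUNNING, QUEUED, SUBMITTED and ABORTED are named in this documentation; COMPLETED and FAILED are invented placeholders), and a step is assumed to count as finished when it is in none of the three active states.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the status rules, NOT the real implementation.
class ExecutionStatusSketch {

  // COMPLETED and FAILED are placeholder values assumed for illustration.
  enum Status { QUEUED, SUBMITTED, RUNNING, COMPLETED, FAILED, ABORTED }

  // States that the documentation says are switched to ABORTED.
  private static final Set<Status> ACTIVE =
      EnumSet.of(Status.RUNNING, Status.QUEUED, Status.SUBMITTED);

  // Returns ABORTED for an active step and leaves any other status untouched.
  static Status abort(Status current) {
    return ACTIVE.contains(current) ? Status.ABORTED : current;
  }

  // Assumption: an execution is finished when no step is still active.
  // Note that an empty step list also counts as finished in this sketch.
  static boolean allStepsFinished(List<Status> steps) {
    return steps.stream().noneMatch(ACTIVE::contains);
  }

  public static void main(String[] args) {
    List<Status> steps = Arrays.asList(Status.COMPLETED, Status.RUNNING);
    System.out.println("finished before abort: " + allStepsFinished(steps));

    List<Status> aborted = Arrays.asList(abort(steps.get(0)), abort(steps.get(1)));
    System.out.println("statuses after abort: " + aborted);
    System.out.println("finished after abort: " + allStepsFinished(aborted));
  }
}
```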
      • runAll

        RunPipelineResponse runAll(@NotBlank(message="Steps parameter is required") String steps,
                                   @NotBlank(message="Reason parameter is required") String reason,
                                   boolean useLastSuccessful,
                                   boolean markPreviousAttemptAsFailed,
                                   @Nullable RunAllParams runAllParams,
                                   @Nullable Set<String> interpretTypes)
        Runs the last attempt for all datasets.
        Parameters:
        steps - steps to run
        reason - reason to run
        useLastSuccessful - true to run the latest successful attempt
        markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
        runAllParams - parameters, containing the datasets to exclude
        interpretTypes - used for partial interpretation, e.g. only TAXONOMY or METADATA
        Returns:
        RunPipelineResponse.
      • runPipelineAttempt

        RunPipelineResponse runPipelineAttempt(@NotNull UUID datasetKey,
                                               @NotBlank(message="Steps parameter is required") String steps,
                                               @NotBlank(message="Reason parameter is required") String reason,
                                               boolean useLastSuccessful,
                                               boolean markPreviousAttemptAsFailed,
                                               @Nullable Set<String> interpretTypes)
        Restarts the last failed pipeline step for a dataset.
        Parameters:
        datasetKey - dataset key
        steps - steps to run
        reason - reason to run
        useLastSuccessful - true to run the latest successful attempt
        markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
        interpretTypes - used for partial interpretation, e.g. only TAXONOMY or METADATA
        Returns:
        RunPipelineResponse.
      • runPipelineAttempt

        RunPipelineResponse runPipelineAttempt(@NotNull UUID datasetKey,
                                               int attempt,
                                               @NotBlank(message="Steps parameter is required") String steps,
                                               @NotBlank(message="Reason parameter is required") String reason,
                                               boolean markPreviousAttemptAsFailed,
                                               @Nullable Set<String> interpretTypes)
        Re-runs a pipeline step.
        Parameters:
        datasetKey - dataset key
        attempt - attempt to run
        steps - steps to run
        reason - reason to run
        markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
        interpretTypes - used for partial interpretation, e.g. only TAXONOMY or METADATA
        Returns:
        RunPipelineResponse.
        Returns:
        RunPipelineResponse.
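
        The @NotBlank constraints on steps and reason in runAll and runPipelineAttempt reject null, empty and whitespace-only strings. A caller that wants to fail early, before invoking the service, could mirror that check by hand. The sketch below is only an illustration: the class RunRequestChecks and its methods are hypothetical, the value "someStep" is a placeholder rather than a real step name, and how the service itself reports a violation depends on the validation framework in use.

```java
// Hypothetical caller-side pre-check, NOT part of PipelinesHistoryService.
class RunRequestChecks {

  // Same rule as @NotBlank: non-null with at least one non-whitespace character.
  static void requireNotBlank(String value, String message) {
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException(message);
    }
  }

  // Reuses the messages declared on the annotated parameters.
  static void checkRunArguments(String steps, String reason) {
    requireNotBlank(steps, "Steps parameter is required");
    requireNotBlank(reason, "Reason parameter is required");
  }

  public static void main(String[] args) {
    // "someStep" is a placeholder value, not a documented step name.
    checkRunArguments("someStep", "reprocess after fix");
    System.out.println("valid arguments accepted");

    try {
      checkRunArguments("   ", "reprocess after fix");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```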
      • sendAbsentIndentifiersEmail

        @Deprecated
        void sendAbsentIndentifiersEmail(@NotNull UUID datasetKey,
                                         int attempt,
                                         @NotNull String message)
        Deprecated.
        Sends an email to the data administrator about an absent identifiers issue with a dataset.

        Deprecated: use notifyAbsentIdentifiers(UUID, int, long, String) instead.

        Parameters:
        datasetKey - dataset key
        attempt - attempt to run
        message - failed metrics and other information
      • allowAbsentIndentifiers

        void allowAbsentIndentifiers(@NotNull UUID datasetKey,
                                     int attempt)
        Marks the failed identifier stage as finished and continues the interpretation process for datasets where the identifier stage failed because of a threshold limit.
        Parameters:
        datasetKey - dataset key
        attempt - attempt to run
      • allowAbsentIndentifiers

        void allowAbsentIndentifiers(@NotNull UUID datasetKey)
        Marks the latest failed identifier stage as finished and continues the interpretation process for datasets where the identifier stage failed because of a threshold limit.
        Parameters:
        datasetKey - dataset key
      • notifyAbsentIdentifiers

        void notifyAbsentIdentifiers​(UUID datasetKey,
                                     int attempt,
                                     long executionKey,
                                     String message)
        Sends a notification to the data administrators about absent identifiers issues with the dataset.
        Parameters:
        datasetKey - key of the dataset
        attempt - crawling attempt
        executionKey - key of the pipelines execution
        message - cause of the issue