Interface PipelinesHistoryService


public interface PipelinesHistoryService
  • Method Details

    • history

      Lists the history of all PipelineProcess instances, sorted in descending order with the most recent first.
      Parameters:
      pageable - paging request
      Returns:
      a paged response that contains a list of PipelineProcess.
    • history

      PagingResponse<PipelineProcess> history(@NotNull UUID datasetKey, Pageable pageable)
      Lists the history of all PipelineProcess instances of a dataset, sorted in descending order with the most recent first.
      Parameters:
      datasetKey - dataset identifier
      pageable - paging request
      Returns:
      a paged response that contains a list of PipelineProcess.
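A paged listing such as the one above is usually consumed by requesting page after page until the response reports the end of the records. The following is a minimal sketch of that loop, not the library's client code. `PageSketch`, `HistorySource` and `HistoryPagingSketch` are hypothetical stand-ins: the real `Pageable` and `PagingResponse` types are replaced by a plain offset/limit pair and a two-field page object, and attempt numbers stand in for `PipelineProcess` instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for a paged response: one page of results plus an end marker.
class PageSketch<T> {
  final List<T> results;
  final boolean endOfRecords;

  PageSketch(List<T> results, boolean endOfRecords) {
    this.results = results;
    this.endOfRecords = endOfRecords;
  }
}

// Hypothetical stand-in for the dataset-scoped history method (offset/limit replace Pageable).
interface HistorySource {
  PageSketch<Integer> history(UUID datasetKey, long offset, int limit);
}

class HistoryPagingSketch {

  // Collects all attempt numbers by requesting pages until the source reports the end.
  static List<Integer> allAttempts(HistorySource source, UUID datasetKey, int pageSize) {
    List<Integer> attempts = new ArrayList<>();
    long offset = 0;
    while (true) {
      PageSketch<Integer> page = source.history(datasetKey, offset, pageSize);
      attempts.addAll(page.results);
      // Stop on the end marker; the empty-page check guards against a source that never sets it.
      if (page.endOfRecords || page.results.isEmpty()) {
        break;
      }
      offset += page.results.size();
    }
    return attempts;
  }

  // In-memory source for demonstration; the list is expected to be sorted most recent first.
  static HistorySource inMemory(final List<Integer> sortedDescending) {
    return (datasetKey, offset, limit) -> {
      int from = (int) Math.min(offset, sortedDescending.size());
      int to = Math.min(from + limit, sortedDescending.size());
      List<Integer> slice = new ArrayList<>(sortedDescending.subList(from, to));
      return new PageSketch<>(slice, to >= sortedDescending.size());
    };
  }
}
```

Because the service already returns results with the most recent first, concatenating the pages in request order preserves that ordering.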
    • getPipelineProcess

      PipelineProcess getPipelineProcess(@NotNull UUID datasetKey, int attempt)
      Gets the PipelineProcess identified by the dataset and attempt identifiers.
      Parameters:
      datasetKey - dataset identifier
      attempt - crawl attempt identifier
      Returns:
      the PipelineProcess, if it exists.
    • getRunningPipelineProcess

      Returns information about all running pipeline executions.
    • createPipelineProcess

      long createPipelineProcess(@NotNull PipelineProcessParameters params)
      Creates and persists a pipeline process for a dataset and attempt identifier. If the process already exists, the existing one is returned.
      Parameters:
      params - pipeline process parameters, containing the dataset key and attempt
      Returns:
      the key of the PipelineProcess created.
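The "return the existing one" behaviour described above is a get-or-create pattern. The sketch below illustrates it with an in-memory map and is not how the service persists processes. `ProcessKeyRegistrySketch` is a hypothetical class, and passing the dataset key and attempt directly (rather than a `PipelineProcessParameters` object) is a simplification made for this example.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory illustration of get-or-create semantics.
class ProcessKeyRegistrySketch {

  // Maps "datasetKey/attempt" to the sequential key assigned on first creation.
  private final Map<String, Long> keys = new ConcurrentHashMap<>();
  private final AtomicLong sequence = new AtomicLong();

  // Returns the key for the (dataset, attempt) pair, assigning a new one only if none exists yet.
  long createPipelineProcess(UUID datasetKey, int attempt) {
    String id = datasetKey + "/" + attempt;
    return keys.computeIfAbsent(id, unused -> sequence.incrementAndGet());
  }
}
```

Calling the method twice with the same dataset key and attempt yields the same key, so callers need not check for existence first.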
    • addPipelineExecution

      long addPipelineExecution(long processKey, @NotNull PipelineExecution pipelineExecution)
      Adds/persists the information of a pipeline execution.
      Parameters:
      processKey - sequential identifier of a pipeline process
      pipelineExecution - pipeline execution data
      Returns:
      the key of the PipelineExecution created.
    • getRunningExecutionKey

      Long getRunningExecutionKey(@NotNull UUID datasetKey)
      Gets the key of the execution that is currently running for a dataset.
      Parameters:
      datasetKey - dataset identifier
      Returns:
      the key of the running execution.
    • updatePipelineStep

      long updatePipelineStep(@NotNull PipelineStep pipelineStep)
      Updates the information of a pipeline step.
      Parameters:
      pipelineStep - step to be updated
      Returns:
      the key of the PipelineStep updated.
    • getPipelineStep

      Gets the PipelineStep with the specified key.
      Parameters:
      stepKey - key of the pipeline step
      Returns:
      PipelineStep.
    • getPipelineStepsByExecutionKey

      Gets the list of PipelineStep instances that belong to the given execution key.
      Parameters:
      executionKey - key of the pipeline execution
      Returns:
      List<PipelineStep>.
    • markAllPipelineExecutionAsFinished

      Marks all pipeline executions as finished, so that the UI no longer shows them as running.
    • markPipelineExecutionIfFinished

      void markPipelineExecutionIfFinished(long executionKey)
      Marks the pipeline execution as finished when all of its pipeline steps are finished.
      Parameters:
      executionKey - key of the pipeline execution
    • markPipelineStatusAsAborted

      void markPipelineStatusAsAborted(long executionKey)
      Changes the status to ABORTED and sets the finished date if the state is RUNNING, QUEUED or SUBMITTED, then marks the pipeline execution as finished.
      Parameters:
      executionKey - key of the pipeline execution
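The two rules above (an execution counts as finished once no step is still in flight, and aborting replaces every in-flight state with ABORTED) can be sketched as pure functions over a list of states. This is an illustration under stated assumptions, not the service's implementation. `ExecutionStateSketch` is a hypothetical class, applying the rule per step is an interpretation of the text, and the COMPLETED and FAILED values are assumed here because the text names only RUNNING, QUEUED, SUBMITTED and ABORTED.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the finish and abort rules.
class ExecutionStateSketch {

  // COMPLETED and FAILED are assumptions; the other four values are named in the text.
  enum Status { SUBMITTED, QUEUED, RUNNING, COMPLETED, FAILED, ABORTED }

  // States that still count as in flight and are therefore eligible for aborting.
  static final Set<Status> IN_FLIGHT =
      EnumSet.of(Status.SUBMITTED, Status.QUEUED, Status.RUNNING);

  // Returns a copy of the states in which every in-flight state is replaced by ABORTED.
  static List<Status> abort(List<Status> steps) {
    List<Status> result = new ArrayList<>(steps.size());
    for (Status status : steps) {
      result.add(IN_FLIGHT.contains(status) ? Status.ABORTED : status);
    }
    return result;
  }

  // An execution is treated as finished when none of its steps is still in flight.
  static boolean allStepsFinished(List<Status> steps) {
    for (Status status : steps) {
      if (IN_FLIGHT.contains(status)) {
        return false;
      }
    }
    return true;
  }
}
```

Steps that have already ended keep their state when aborting, so only unfinished work is rewritten.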
    • runAll

      RunPipelineResponse runAll(@NotBlank(message="Steps parameter is required") String steps, @NotBlank(message="Reason parameter is required") String reason, boolean useLastSuccessful, boolean markPreviousAttemptAsFailed, @Nullable RunAllParams runAllParams, @Nullable Set<String> interpretTypes)
      Runs the last attempt for all datasets.
      Parameters:
      steps - steps to run
      reason - reason to run
      useLastSuccessful - true to run the latest successful attempt
      markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
      runAllParams - parameters, containing the datasets to exclude
      interpretTypes - used for partial interpretation, for example only TAXONOMY or METADATA
      Returns:
      RunPipelineResponse.
    • runPipelineAttempt

      RunPipelineResponse runPipelineAttempt(@NotNull UUID datasetKey, @NotBlank(message="Steps parameter is required") String steps, @NotBlank(message="Reason parameter is required") String reason, boolean useLastSuccessful, boolean markPreviousAttemptAsFailed, @Nullable Set<String> interpretTypes)
      Restarts the last failed pipeline step for a dataset.
      Parameters:
      datasetKey - dataset key
      steps - steps to run
      reason - reason to run
      useLastSuccessful - true to run the latest successful attempt
      markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
      interpretTypes - used for partial interpretation, for example only TAXONOMY or METADATA
      Returns:
      RunPipelineResponse.
    • runPipelineAttempt

      RunPipelineResponse runPipelineAttempt(@NotNull UUID datasetKey, int attempt, @NotBlank(message="Steps parameter is required") String steps, @NotBlank(message="Reason parameter is required") String reason, boolean markPreviousAttemptAsFailed, @Nullable Set<String> interpretTypes)
      Re-runs a pipeline step for the given attempt of a dataset.
      Parameters:
      datasetKey - dataset key
      attempt - attempt to run
      steps - steps to run
      reason - reason to run
      markPreviousAttemptAsFailed - true to mark the previous attempt as failed; its recorded status can be wrong if the CLI was restarted while processing a dataset
      interpretTypes - used for partial interpretation, for example only TAXONOMY or METADATA
      Returns:
      RunPipelineResponse.
    • sendAbsentIndentifiersEmail

      @Deprecated void sendAbsentIndentifiersEmail(@NotNull UUID datasetKey, int attempt, @NotNull String message)
      Deprecated. Use notifyAbsentIdentifiers(UUID, int, long, String) instead.
      Sends an email to the data administrators about an absent identifiers issue with a dataset.

      Parameters:
      datasetKey - dataset key
      attempt - attempt to run
      message - message with the failed metrics and other information
    • allowAbsentIndentifiers

      void allowAbsentIndentifiers(@NotNull UUID datasetKey, int attempt)
      Marks the failed identifier stage as finished and continues the interpretation process for a dataset whose identifier stage failed because of a threshold limit.
      Parameters:
      datasetKey - dataset key
      attempt - attempt to run
    • allowAbsentIndentifiers

      void allowAbsentIndentifiers(@NotNull UUID datasetKey)
      Marks the latest failed identifier stage as finished and continues the interpretation process for a dataset whose identifier stage failed because of a threshold limit.
      Parameters:
      datasetKey - dataset key
    • notifyAbsentIdentifiers

      void notifyAbsentIdentifiers(UUID datasetKey, int attempt, long executionKey, String message)
      Sends a notification to the data administrators about absent identifiers issues with the dataset.
      Parameters:
      datasetKey - key of the dataset
      attempt - crawling attempt
      executionKey - key of the pipelines execution
      message - cause of the issue
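Unlike the deprecated email method, notifyAbsentIdentifiers also takes an execution key, which a caller could obtain from getRunningExecutionKey. The sketch below shows one possible way to combine the two calls. `AbsentIdNotifier`, `RecordingNotifier` and `NotifySketch` are hypothetical names introduced for illustration, and treating a null key as "no execution is running" is an assumption based only on the boxed `Long` return type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in exposing only the two service methods this sketch calls.
interface AbsentIdNotifier {
  Long getRunningExecutionKey(UUID datasetKey);

  void notifyAbsentIdentifiers(UUID datasetKey, int attempt, long executionKey, String message);
}

// Test double that records notifications instead of sending them.
class RecordingNotifier implements AbsentIdNotifier {
  final List<String> sent = new ArrayList<>();
  private final Long runningKey;

  RecordingNotifier(Long runningKey) {
    this.runningKey = runningKey;
  }

  @Override
  public Long getRunningExecutionKey(UUID datasetKey) {
    return runningKey;
  }

  @Override
  public void notifyAbsentIdentifiers(UUID datasetKey, int attempt, long executionKey, String message) {
    sent.add(attempt + ":" + executionKey + ":" + message);
  }
}

class NotifySketch {

  // Sends the notification only when an execution key is available; returns whether it was sent.
  static boolean notifyIfRunning(AbsentIdNotifier service, UUID datasetKey, int attempt, String message) {
    Long executionKey = service.getRunningExecutionKey(datasetKey);
    if (executionKey == null) { // assumption: null means nothing is running for the dataset
      return false;
    }
    service.notifyAbsentIdentifiers(datasetKey, attempt, executionKey, message);
    return true;
  }
}
```

Returning a boolean lets the caller decide how to handle a dataset with no running execution, for example by logging it.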