Oracle 11g: DBMS_PARALLEL_EXECUTE
2011-06-09 13:49 Tracy. 阅读(1399) 评论(0) 收藏 举报DBMS_PARALLEL_EXECUTE
TheDBMS_PARALLEL_EXECUTE package allows a workload associated with a base table to be broken down into smaller chunks which can be run in parallel. This process involves several distinct stages. At the end of the article there are some complete examples, using some of the techniques discussed below. The user controlling the process needs the
CREATE JOB privilege. The examples used in this article require the following table to be created and populated.CONN / AS SYSDBA GRANT CREATE JOB TO test;
CONN test/test
DROP TABLE test_tab;
CREATE TABLE test_tab (
id NUMBER,
description VARCHAR2(50),
num_col NUMBER,
CONSTRAINT test_tab_pk PRIMARY KEY (id)
);
INSERT /*+ APPEND */ INTO test_tab
SELECT level,
'Description for ' || level,
CASE
WHEN MOD(level, 5) = 0 THEN 10
WHEN MOD(level, 3) = 0 THEN 20
ELSE 30
END
FROM dual
CONNECT BY level <= 500000;
COMMIT;
EXEC DBMS_STATS.gather_table_stats(USER, 'TEST_TAB', cascade => TRUE);
SELECT num_col, COUNT(*)
FROM test_tab
GROUP BY num_col
ORDER BY num_col;
NUM_COL COUNT(*)
---------- ----------
10 100000
20 133333
30 266667
SQL>
Create a task
TheCREATE_TASK procedure is used to create a new task. It requires a task name to be specified, but can also include an optional task comment.
Information about existing tasks is displayed using theBEGIN DBMS_PARALLEL_EXECUTE.create_task (task_name => 'test_task'); END; /
[DBA|USER]_PARALLEL_EXECUTE_TASKS views.
COLUMN task_name FORMAT A10
SELECT task_name,
status
FROM user_parallel_execute_tasks;
TASK_NAME STATUS
---------- -------------------
test_task CREATED
SQL>
The GENERATE_TASK_NAME function returns a unique task name if you do not want to name the task manually.
SELECT DBMS_PARALLEL_EXECUTE.generate_task_name FROM dual; GENERATE_TASK_NAME -------------------------------------------------------------------------------- TASK$_726 SQL>
Split the workload into chunks
The workload is associated with a base table, which can be split into subsets or chunks of rows. There are three methods of splitting the workload into chunks. The chunks associated with a task can be dropped using theDROP_CHUNKS procedure.
CREATE_CHUNKS_BY_ROWID
TheCREATE_CHUNKS_BY_ROWID procedure splits the data by rowid into chunks specified by the CHUNK_SIZE parameter. If the BY_ROW parameter is set to TRUE, the CHUNK_SIZE refers to the number of rows, otherwise it refers to the number of blocks.
BEGIN
DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(task_name => 'test_task',
table_owner => 'TEST',
table_name => 'TEST_TAB',
by_row => TRUE,
chunk_size => 10000);
END;
/
Once the operation is complete the task status is changed to 'CHUNKED'.
COLUMN task_name FORMAT A10
SELECT task_name,
status
FROM user_parallel_execute_tasks;
TASK_NAME STATUS
---------- -------------------
test_task CHUNKED
SQL>
The [DBA|USER]_PARALLEL_EXECUTE_CHUNKS views display information about the individual chunks.
SELECT chunk_id, status, start_rowid, end_rowid
FROM user_parallel_execute_chunks
WHERE task_name = 'test_task'
ORDER BY chunk_id;
CHUNK_ID STATUS START_ROWID END_ROWID
---------- -------------------- ------------------ ------------------
287 UNASSIGNED AAASjoAAEAAAAIwAAA AAASjoAAEAAAAI3CcP
288 UNASSIGNED AAASjoAAEAAAAI4AAA AAASjoAAEAAAAI/CcP
...
450 UNASSIGNED AAASjoAAEAAAAIIAAA AAASjoAAEAAAAIPCcP
451 UNASSIGNED AAASjoAAEAAAAIoAAA AAASjoAAEAAAAIvCcP
88 rows selected.
SQL>
CREATE_CHUNKS_BY_NUMBER_COL
TheCREATE_CHUNKS_BY_NUMBER_COL procedure divides the workload up based on a number column. It uses the specified columns min and max values along with the chunk size to split the data into approximately equal chunks. For the chunks to be equally sized the column must contain a continuous sequence of numbers, like that generated by a sequence.
BEGIN
DBMS_PARALLEL_EXECUTE.create_chunks_by_number_col(task_name => 'test_task',
table_owner => 'TEST',
table_name => 'TEST_TAB',
table_column => 'ID',
chunk_size => 10000);
END;
/
The [DBA|USER]_PARALLEL_EXECUTE_CHUNKS views display information about the individual chunks.
SELECT chunk_id, status, start_id, end_id
FROM user_parallel_execute_chunks
WHERE task_name = 'test_task'
ORDER BY chunk_id;
CHUNK_ID STATUS START_ID END_ID
---------- -------------------- ---------- ----------
600 UNASSIGNED 1 10000
601 UNASSIGNED 10001 20000
...
648 UNASSIGNED 480001 490000
649 UNASSIGNED 490001 500000
50 rows selected.
SQL>
CREATE_CHUNKS_BY_SQL
TheCREATE_CHUNKS_BY_SQL procedure divides the workload based on a user-defined query. If the BY_ROWID parameter is set to TRUE, the query must return a series of start and end rowids. If it's set to FALSE, the query must return a series of start and end IDs.
DECLARE
l_stmt CLOB;
BEGIN
l_stmt := 'SELECT DISTINCT num_col, num_col FROM test_tab';
DBMS_PARALLEL_EXECUTE.create_chunks_by_sql(task_name => 'test_task',
sql_stmt => l_stmt,
by_rowid => FALSE);
END;
/
The [DBA|USER]_PARALLEL_EXECUTE_CHUNKS views display information about the individual chunks.
SELECT chunk_id, status, start_id, end_id
FROM user_parallel_execute_chunks
WHERE task_name = 'test_task'
ORDER BY chunk_id;
CHUNK_ID STATUS START_ID END_ID
---------- -------------------- ---------- ----------
650 UNASSIGNED 10 10
651 UNASSIGNED 30 30
652 UNASSIGNED 20 20
3 rows selected.
SQL>
Run the task
Running a task involves running a specific statement for each defined chunk of work. The documentation only shows examples using updates of the base table, but this is not the only use of this functionality. The statement associated with the task can be a procedure call, as shown in one of the examples at the end of the article.There are two ways to run a task and several procedures to control a running task.
RUN_TASK
TheRUN_TASK procedure runs the specified statement in parallel by scheduling jobs to process the workload chunks. The statement specifying the actual work to be done must include a reference to the ':start_id' and ':end_id', which represent a range of rowids or column IDs to be processed, as specified in the chunk definitions. The degree of parallelism is controlled by the number of scheduled jobs, not the number of chunks defined. The scheduled jobs take an unassigned workload chunk, process it, then move on to the next unassigned chunk.
DECLARE
l_sql_stmt VARCHAR2(32767);
BEGIN
l_sql_stmt := 'UPDATE /*+ ROWID (dda) */ test_tab t
SET t.num_col = t.num_col + 10
WHERE rowid BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.run_task(task_name => 'test_task',
sql_stmt => l_sql_stmt,
language_flag => DBMS_SQL.NATIVE,
parallel_level => 10);
END;
/
The RUN_TASK procedure waits for the task to complete. On completion, the status of the task must be assessed to know what action to take next.
User-defined framework
TheDBMS_PARALLEL_EXECUTE package allows you to manually code the task run. The GET_ROWID_CHUNK and GET_NUMBER_COL_CHUNK procedures return the next available unassigned chunk. You can than manually process the chunk and set its status. The example below shows the processing of a workload chunked by rowid.
DECLARE
l_sql_stmt VARCHAR2(32767);
l_chunk_id NUMBER;
l_start_rowid ROWID;
l_end_rowid ROWID;
l_any_rows BOOLEAN;
BEGIN
l_sql_stmt := 'UPDATE /*+ ROWID (dda) */ test_tab t
SET t.num_col = t.num_col + 10
WHERE rowid BETWEEN :start_id AND :end_id';
LOOP
-- Get next unassigned chunk.
DBMS_PARALLEL_EXECUTE.get_rowid_chunk(task_name => 'test_task',
chunk_id => l_chunk_id,
start_rowid => l_start_rowid,
end_rowid => l_end_rowid,
any_rows => l_any_rows);
EXIT WHEN l_any_rows = FALSE;
BEGIN
-- Manually execute the work.
EXECUTE IMMEDIATE l_sql_stmt USING l_start_rowid, l_end_rowid;
-- Set the chunk status as processed.
DBMS_PARALLEL_EXECUTE.set_chunk_status(task_name => 'test_task',
chunk_id => l_chunk_id,
status => DBMS_PARALLEL_EXECUTE.PROCESSED);
EXCEPTION
WHEN OTHERS THEN
-- Record chunk error.
DBMS_PARALLEL_EXECUTE.set_chunk_status(task_name => 'test_task',
chunk_id => l_chunk_id,
status => DBMS_PARALLEL_EXECUTE.PROCESSED_WITH_ERROR,
err_num => SQLCODE,
err_msg => SQLERRM);
END;
-- Commit work.
COMMIT;
END LOOP;
END;
/
Task control
A running task can be stopped and restarted using theSTOP_TASK and RESUME_TASK procedures respectively.
The
PURGE_PROCESSED_CHUNKS procedure deletes all chunks with a status of 'PROCESSED' or 'PROCESSED_WITH_ERROR'.
The
ADM_DROP_CHUNKS, ADM_DROP_TASK, ADM_TASK_STATUS and ADM_STOP_TASK routines have the same function as their namesakes, but they allow the operations to performed on tasks owned by other users. In order to use these routines the user must have been granted the ADM_PARALLEL_EXECUTE_TASK role.
Check the task status
The simplest way to check the status of a task is to use theTASK_STATUS function. After execution of the task, the only possible return values are the 'FINISHED' or 'FINISHED_WITH_ERROR' constants. If the status is not 'FINISHED', then the task can be resumed using the RESUME_TASK procedure.
DECLARE
l_try NUMBER;
l_status NUMBER;
BEGIN
-- If there is error, RESUME it for at most 2 times.
l_try := 0;
l_status := DBMS_PARALLEL_EXECUTE.task_status('test_task');
WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
Loop
l_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.resume_task('test_task');
l_status := DBMS_PARALLEL_EXECUTE.task_status('test_task');
END LOOP;
END;
/
The status of the task and the chunks can also be queried.
COLUMN task_name FORMAT A10
SELECT task_name,
status
FROM user_parallel_execute_tasks;
TASK_NAME STATUS
---------- -------------------
test_task FINISHED
SQL>
If there were errors, the chunks can be queried to identify the problems.
TheSELECT status, COUNT(*) FROM user_parallel_execute_chunks GROUP BY status ORDER BY status; STATUS COUNT(*) -------------------- ---------- PROCESSED 88 SQL>
[DBA|USER]_PARALLEL_EXECUTE_TASKS views contain a record of the JOB_PREFIX used when scheduling the chunks of work.
This value can be used to query information about the individual jobs used during the process. The number of jobs scheduled should match the degree of parallelism specified in theSELECT job_prefix FROM user_parallel_execute_tasks WHERE task_name = 'test_task'; JOB_PREFIX ------------------------------ TASK$_368 SQL>
RUN_TASK procedure.
COLUMN job_name FORMAT A20
SELECT job_name, status
FROM user_scheduler_job_run_details
WHERE job_name LIKE (SELECT job_prefix || '%'
FROM user_parallel_execute_tasks
WHERE task_name = 'test_task');
JOB_NAME STATUS
-------------------- ------------------------------
TASK$_368_1 SUCCEEDED
TASK$_368_6 SUCCEEDED
TASK$_368_2 SUCCEEDED
TASK$_368_9 SUCCEEDED
TASK$_368_10 SUCCEEDED
TASK$_368_8 SUCCEEDED
TASK$_368_7 SUCCEEDED
TASK$_368_4 SUCCEEDED
TASK$_368_5 SUCCEEDED
TASK$_368_3 SUCCEEDED
10 rows selected.
SQL>
Drop the task
Once the job is complete you can drop the task, which will drop the associated chunk information also.
BEGIN
DBMS_PARALLEL_EXECUTE.drop_task('test_task');
END;
/
Complete examples
The following example shows the processing of a workload chunked by rowid.
DECLARE
l_task VARCHAR2(30) := 'test_task';
l_sql_stmt VARCHAR2(32767);
l_try NUMBER;
l_status NUMBER;
BEGIN
DBMS_PARALLEL_EXECUTE.create_task (task_name => l_task);
DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(task_name => l_task,
table_owner => 'TEST',
table_name => 'TEST_TAB',
by_row => TRUE,
chunk_size => 10000);
l_sql_stmt := 'UPDATE /*+ ROWID (dda) */ test_tab t
SET t.num_col = t.num_col + 10
WHERE rowid BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.run_task(task_name => l_task,
sql_stmt => l_sql_stmt,
language_flag => DBMS_SQL.NATIVE,
parallel_level => 10);
-- If there is error, RESUME it for at most 2 times.
l_try := 0;
l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task);
WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
Loop
l_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.resume_task(l_task);
l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task);
END LOOP;
DBMS_PARALLEL_EXECUTE.drop_task(l_task);
END;
/
The following example shows the processing of a workload chunked by a number column. Notice that the workload is actually a stored procedure in this case.
CREATE OR REPLACE PROCEDURE process_update (p_start_id IN NUMBER, p_end_id IN NUMBER) AS
BEGIN
UPDATE /*+ ROWID (dda) */ test_tab t
SET t.num_col = t.num_col + 10
WHERE id BETWEEN p_start_id AND p_end_id;
END;
/
DECLARE
l_task VARCHAR2(30) := 'test_task';
l_sql_stmt VARCHAR2(32767);
l_try NUMBER;
l_status NUMBER;
BEGIN
DBMS_PARALLEL_EXECUTE.create_task (task_name => l_task);
DBMS_PARALLEL_EXECUTE.create_chunks_by_number_col(task_name => l_task,
table_owner => 'TEST',
table_name => 'TEST_TAB',
table_column => 'ID',
chunk_size => 10000);
l_sql_stmt := 'BEGIN process_update(:start_id, :end_id); END;';
DBMS_PARALLEL_EXECUTE.run_task(task_name => l_task,
sql_stmt => l_sql_stmt,
language_flag => DBMS_SQL.NATIVE,
parallel_level => 10);
-- If there is error, RESUME it for at most 2 times.
l_try := 0;
l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task);
WHILE(l_try < 2 and l_status != DBMS_PARALLEL_EXECUTE.FINISHED)
Loop
l_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.resume_task(l_task);
l_status := DBMS_PARALLEL_EXECUTE.task_status(l_task);
END LOOP;
DBMS_PARALLEL_EXECUTE.drop_task(l_task);
END;
/
The following example shows a workload chunked by an SQL statement and processed by a user-defined framework.
DECLARE
l_task VARCHAR2(30) := 'test_task';
l_stmt CLOB;
l_sql_stmt VARCHAR2(32767);
l_chunk_id NUMBER;
l_start_id NUMBER;
l_end_id NUMBER;
l_any_rows BOOLEAN;
BEGIN
DBMS_PARALLEL_EXECUTE.create_task (task_name => l_task);
l_stmt := 'SELECT DISTINCT num_col, num_col FROM test_tab';
DBMS_PARALLEL_EXECUTE.create_chunks_by_sql(task_name => l_task,
sql_stmt => l_stmt,
by_rowid => FALSE);
l_sql_stmt := 'UPDATE /*+ ROWID (dda) */ test_tab t
SET t.num_col = t.num_col
WHERE num_col BETWEEN :start_id AND :end_id';
LOOP
-- Get next unassigned chunk.
DBMS_PARALLEL_EXECUTE.get_number_col_chunk(task_name => 'test_task',
chunk_id => l_chunk_id,
start_id => l_start_id,
end_id => l_end_id,
any_rows => l_any_rows);
EXIT WHEN l_any_rows = FALSE;
BEGIN
-- Manually execute the work.
EXECUTE IMMEDIATE l_sql_stmt USING l_start_id, l_end_id;
-- Set the chunk status as processed.
DBMS_PARALLEL_EXECUTE.set_chunk_status(task_name => 'test_task',
chunk_id => l_chunk_id,
status => DBMS_PARALLEL_EXECUTE.PROCESSED);
EXCEPTION
WHEN OTHERS THEN
-- Record chunk error.
DBMS_PARALLEL_EXECUTE.set_chunk_status(task_name => 'test_task',
chunk_id => l_chunk_id,
status => DBMS_PARALLEL_EXECUTE.PROCESSED_WITH_ERROR,
err_num => SQLCODE,
err_msg => SQLERRM);
END;
-- Commit work.
COMMIT;
END LOOP;
DBMS_PARALLEL_EXECUTE.drop_task(l_task);
END;
/
本文来自博客园,作者:Tracy.,转载请注明原文链接:https://www.cnblogs.com/tracy/archive/2011/06/09/2076324.html
浙公网安备 33010602011771号