Yann Neuhaus

dbi services technical blog

dbvisit dbvctl process is terminating abnormally with Error Code: 2044

Fri, 2019-11-29 03:07

When applying archive logs on the standby, the dbvctl process can terminate abnormally with Error Code: 2044. This can happen when several large archive logs need to be applied.

Problem description

With dbvisit there are two ways to recover the archive logs on the standby: either using sqlplus or RMAN. By default the configuration is set to sqlplus. It can happen that, following a maintenance window where synchronization had to be suspended, there is a huge gap between the primary and the standby databases and several archive logs need to be applied. The problem is even more visible if the archive log files are big. In my case there were about 34 archive logs to be applied following a maintenance activity, and each file was 8 GB.

Applying the archive logs on the standby failed as seen in the following output.

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DDC_name
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 1939)
dbvctl started on server_name: Mon Oct 28 16:19:01 2019
=============================================================
 
 
>>> Applying Log file(s) from primary_server to DB_name on standby_server:
 
 
Dbvisit Standby terminated...
Error Code: 2044
File (/u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv) does not
exist or is empty. Please check space and file permissions.
 
Tracefile from server: server_name (PID:1939)
1939_dbvctl_DB_name_201910281619.trc
 
oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ls -l /u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv
ls: cannot access /u01/app/dbvisit/standby/tmp/1939.dbvisit.201910281619.sqlplus.dbv: No such file or directory

Solution

To solve this problem, you can change the DDC configuration on the primary to use RMAN to apply the archive logs, at least until the gap is caught up. You will have to synchronize the standby configuration as well.
To use RMAN, set the APPLY_ARCHIVE_RMAN parameter to Y in the DDC configuration file.

The procedure is described below:

Backup the DDC configuration file

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] cd conf
oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] cp -p dbv_DDC_name.env dbv_DDC_name.env.20191028

Change the parameter

oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] vi dbv_DDC_name.env
oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] diff dbv_DDC_name.env dbv_DDC_name.env.20191028
543c543
< APPLY_ARCHIVE_RMAN = Y
---
> APPLY_ARCHIVE_RMAN = N
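
If you prefer a non-interactive change, a simple sed one-liner does the same. This is only a sketch, assuming the parameter is written as "APPLY_ARCHIVE_RMAN = N" in your DDC file, so verify the result afterwards:

oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] sed -i 's/^APPLY_ARCHIVE_RMAN *= *N/APPLY_ARCHIVE_RMAN = Y/' dbv_DDC_name.env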

Send the configuration changes to the standby

oracle@server_name:/u01/app/dbvisit/standby/conf/ [DB_name] cd ..
oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DDC_name -C
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 9318)
dbvctl started on server_name: Mon Oct 28 16:51:45 2019
=============================================================
 
>>> Dbvisit Standby configurational differences found between primary_server and standby_server.
Synchronised.
 
=============================================================
dbvctl ended on server_name: Mon Oct 28 16:51:52 2019
=============================================================

Apply the archive logs on the standby again and this time it completes successfully

oracle@server_name:/u01/app/dbvisit/standby/ [DB_name] ./dbvctl -d DDC_name
=============================================================
Dbvisit Standby Database Technology (8.0.20_0_g7e6bd51b) (pid 50909)
dbvctl started on server_name: Mon Oct 28 16:53:05 2019
=============================================================
 
 
>>> Applying Log file(s) from primary_server to DB_name on standby_server:
 
 
Next SCN required for recovery 3328390017 generated at 2019-10-28:11:57:42 +01:00.
Next log(s) required for recovery:
thread 1 sequence 77553
>>> Searching for new archive logs under /u03/app/oracle/dbvisit_arch/DB_name_SITE2... done
thread 1 sequence 77553 (1_77553_973158276.arc)
thread 1 sequence 77554 (1_77554_973158276.arc)
thread 1 sequence 77555 (1_77555_973158276.arc)
thread 1 sequence 77556 (1_77556_973158276.arc)
thread 1 sequence 77557 (1_77557_973158276.arc)
thread 1 sequence 77558 (1_77558_973158276.arc)
thread 1 sequence 77559 (1_77559_973158276.arc)
thread 1 sequence 77560 (1_77560_973158276.arc)
thread 1 sequence 77561 (1_77561_973158276.arc)
thread 1 sequence 77562 (1_77562_973158276.arc)
thread 1 sequence 77563 (1_77563_973158276.arc)
thread 1 sequence 77564 (1_77564_973158276.arc)
thread 1 sequence 77565 (1_77565_973158276.arc)
thread 1 sequence 77566 (1_77566_973158276.arc)
thread 1 sequence 77567 (1_77567_973158276.arc)
thread 1 sequence 77568 (1_77568_973158276.arc)
thread 1 sequence 77569 (1_77569_973158276.arc)
thread 1 sequence 77570 (1_77570_973158276.arc)
thread 1 sequence 77571 (1_77571_973158276.arc)
thread 1 sequence 77572 (1_77572_973158276.arc)
thread 1 sequence 77573 (1_77573_973158276.arc)
thread 1 sequence 77574 (1_77574_973158276.arc)
>>> Catalog archives... done
>>> Recovering database... done
Last applied log(s):
thread 1 sequence 77574
 
Next SCN required for recovery 3331579974 generated at 2019-10-28:16:50:18 +01:00.
Next required log thread sequence
 
>>> Dbvisit Archive Management Module (AMM)
 
Config: number of archives to keep = 0
Config: number of days to keep archives = 7
Config: diskspace full threshold = 80%
==========
 
Processing /u03/app/oracle/dbvisit_arch/DB_name_SITE2...
Archive log dir: /u03/app/oracle/dbvisit_arch/DB_name_SITE2
Total number of archive files : 1025
Number of archive logs deleted = 8
Current Disk percent full : 51%
 
=============================================================
dbvctl ended on server_name: Mon Oct 28 17:10:36 2019
=============================================================


ODA hang/crash due to software raid-check

Fri, 2019-11-29 02:33

Oracle Database Appliance (ODA) is by default configured with software RAID for the Operating System and the Oracle Database software file system (2 internal SSD disks). Two RAID devices are configured: md0 and md1. ODAs are configured to run raid-check every Sunday at 1am.

Analysing the problem

If the ODA is under some load during the raid-check, it can happen that the server freezes. Only the IP layer seems to still be alive: the server replies to ping, but the ssh layer is not available any more.
Nothing can be done with the ODA: no ssh connection, all logs and writes on the server are stuck, and an ILOM serial connection is impossible.

The only solution is to power cycle the ODA through ILOM.

The problem could be reproduced on the customer side by running 2 RMAN database backups and manually executing the raid-check.

In /var/log/messages we can see that the server hung doing the raid-check on md1:

Oct 27 01:00:01 ODA02 kernel: [6245829.462343] md: data-check of RAID array md0
Oct 27 01:00:01 ODA02 kernel: [6245829.462347] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:00:01 ODA02 kernel: [6245829.462349] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:00:01 ODA02 kernel: [6245829.462364] md: using 128k window, over a total of 511936k.
Oct 27 01:00:04 ODA02 kernel: [6245832.154108] md: md0: data-check done.
Oct 27 01:01:02 ODA02 kernel: [6245890.375430] md: data-check of RAID array md1
Oct 27 01:01:02 ODA02 kernel: [6245890.375433] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Oct 27 01:01:02 ODA02 kernel: [6245890.375435] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
Oct 27 01:01:02 ODA02 kernel: [6245890.375452] md: using 128k window, over a total of 467694592k.
Oct 27 04:48:07 ODA02 kernel: imklog 5.8.10, log source = /proc/kmsg started. ==> Restart of ODA with ILOM, server froze on data-check of RAID array md1
Oct 27 04:48:07 ODA02 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="5788" x-info="http://www.rsyslog.com"] start
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuset
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpu
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Initializing cgroup subsys cpuacct
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Linux version 4.1.12-124.20.3.el6uek.x86_64 (mockbuild@ca-build84.us.oracle.com) (gcc version 4.9.2 20150212 (Red Hat 4.9.2-6.2.0.3) (GCC) ) #2 SMP Thu Oct 11 17:47:32 PDT 2018
Oct 27 04:48:07 ODA02 kernel: [ 0.000000] Command line: ro root=/dev/mapper/VolGroupSys-LogVolRoot rd_NO_LUKS rd_MD_UUID=424664a7:c29524e9:c7e10fcf:d893414e rd_LVM_LV=VolGroupSys/LogVolRoot rd_LVM_LV=VolGroupSys/LogVolSwap SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM pci=noaer crashkernel=256M@64M loglevel=3 panic=60 transparent_hugepage=never biosdevname=1 ipv6.disable=1 intel_idle.max_cstate=1 nofloppy nomce numa=off console=ttyS0,115200n8 console

Solution: Reduce raid-check CPU and IO priority

By default the raid check is configured with low priority. Setting the priority to idle limits the resources used by the check.

Change NICE=low to NICE=idle in the /etc/sysconfig/raid-check configuration file.

[root@ODA02 log]# cat /etc/sysconfig/raid-check
#!/bin/bash
#
# Configuration file for /usr/sbin/raid-check
#
# options:
# ENABLED - must be yes in order for the raid check to proceed
# CHECK - can be either check or repair depending on the type of
# operation the user desires. A check operation will scan
# the drives looking for bad sectors and automatically
# repairing only bad sectors. If it finds good sectors that
# contain bad data (meaning that the data in a sector does
# not agree with what the data from another disk indicates
# the data should be, for example the parity block + the other
# data blocks would cause us to think that this data block
# is incorrect), then it does nothing but increments the
# counter in the file /sys/block/$dev/md/mismatch_count.
# This allows the sysadmin to inspect the data in the sector
# and the data that would be produced by rebuilding the
# sector from redundant information and pick the correct
# data to keep. The repair option does the same thing, but
# when it encounters a mismatch in the data, it automatically
# updates the data to be consistent. However, since we really
# don't know whether it's the parity or the data block that's
# correct (or which data block in the case of raid1), it's
# luck of the draw whether or not the user gets the right
# data instead of the bad data. This option is the default
# option for devices not listed in either CHECK_DEVS or
# REPAIR_DEVS.
# CHECK_DEVS - a space delimited list of devs that the user specifically
# wants to run a check operation on.
# REPAIR_DEVS - a space delimited list of devs that the user
# specifically wants to run a repair on.
# SKIP_DEVS - a space delimited list of devs that should be skipped
# NICE - Change the raid check CPU and IO priority in order to make
# the system more responsive during lengthy checks. Valid
# values are high, normal, low, idle.
# MAXCONCURENT - Limit the number of devices to be checked at a time.
# By default all devices will be checked at the same time.
#
# Note: the raid-check script is run by the /etc/cron.d/raid-check cron job.
# Users may modify the frequency and timing at which raid-check is run by
# editing that cron job and their changes will be preserved across updates
# to the mdadm package.
#
# Note2: you can not use symbolic names for the raid devices, such as you
# /dev/md/root. The names used in this file must match the names seen in
# /proc/mdstat and in /sys/block.
 
ENABLED=yes
CHECK=check
NICE=idle
# To check devs /dev/md0 and /dev/md3, use "md0 md3"
CHECK_DEVS=""
REPAIR_DEVS=""
SKIP_DEVS=""
MAXCONCURRENT=
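
If you want to script this change instead of editing the file manually, something like the following should do it (a sketch, to be verified on your ODA before use):

[root@ODA02 log]# cp -p /etc/sysconfig/raid-check /etc/sysconfig/raid-check.orig
[root@ODA02 log]# sed -i 's/^NICE=low/NICE=idle/' /etc/sysconfig/raid-check
[root@ODA02 log]# grep ^NICE /etc/sysconfig/raid-check
NICE=idle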

Change raid-check scheduling

Configure raid-check to run during a low-activity period, and avoid running it during database backup windows, for example.

[root@ODA02 ~]# cd /etc/cron.d
 
[root@ODA02 cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check
 
[root@ODA02 cron.d]# vi raid-check
 
[root@ODA02 cron.d]# cat raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 19 * * Sat root /usr/sbin/raid-check

Conclusion

These configuration changes were successfully tested on the customer environment. No crash/hang was experienced with the NICE parameter set to idle.
As per the Oracle documentation, the ODA BIOS default configuration could be changed to use hardware RAID:
ODA – configuring RAID
The question would be whether patching an ODA is still possible afterwards. If you would like to change this configuration, I would strongly recommend getting Oracle Support approval first.


Enabling, disabling, and validating foreign key constraints in PostgreSQL

Thu, 2019-11-28 01:39

Constraints are an important concept in every relational database system and they guarantee the correctness of your data. While constraints are essential, there are situations when it is required to disable or drop them temporarily. The reason could be performance related, because it is faster to validate the constraints at once after a data load. The reason could also be that you need to load data and you do not know if the data is ordered in such a way that all foreign keys will validate while the data is loaded. In such a case you either need to drop the constraints or to disable them until the data load is done. Validation of the constraints is then deferred until all your data is there.

As always, let's start with a simple test case: two tables, the second one referencing the first one:

postgres=# create table t1 ( a int primary key
postgres(#                 , b text
postgres(#                 , c date
postgres(#                 );
CREATE TABLE
postgres=# create table t2 ( a int primary key
postgres(#                 , b int references t1(a)
postgres(#                 , c text
postgres(#                 );
CREATE TABLE

Two rows for each of them:

postgres=# insert into t1 (a,b,c) values(1,'aa',now());
INSERT 0 1
postgres=# insert into t1 (a,b,c) values(2,'bb',now());
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (1,1,'aa');
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (2,2,'aa');
INSERT 0 1

Currently the two tiny tables look like this:

postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | text    |           |          | 
 c      | date    |           |          | 
Indexes:
    "t1_pkey" PRIMARY KEY, btree (a)
Referenced by:
    TABLE "t2" CONSTRAINT "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# 

Let's assume we want to load some data provided by a script. As we do not know the ordering of the data in the script, we decide to disable the foreign key constraint on the t2 table and validate it after the load:

postgres=# alter table t2 disable trigger all;
ALTER TABLE

The syntax might look a bit strange but it actually does disable the foreign key, and it would have disabled all the foreign keys if there had been more than one. It becomes clearer when we look at the table again:

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)
Disabled internal triggers:
    "RI_ConstraintTrigger_c_16460" AFTER INSERT ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_ins"()
    "RI_ConstraintTrigger_c_16461" AFTER UPDATE ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_upd"()

“ALL” means: please also disable the internal triggers that are responsible for verifying the constraints. One restriction of the “ALL” keyword is that you need to be a superuser to do that. Trying it with a normal user will fail:

postgres=# create user u1 with login password 'u1';
CREATE ROLE
postgres=# \c postgres u1
You are now connected to database "postgres" as user "u1".
postgres=> create table t3 ( a int primary key
postgres(>                 , b text
postgres(>                 , c date
postgres(>                 );
CREATE TABLE
postgres=> create table t4 ( a int primary key
postgres(>                 , b int references t3(a)
postgres(>                 , c text
postgres(>                 );
CREATE TABLE
postgres=> alter table t4 disable trigger all;
ERROR:  permission denied: "RI_ConstraintTrigger_c_16484" is a system trigger
postgres=> 

What you can do as a regular user is disable the user triggers:

postgres=> alter table t4 disable trigger user;
ALTER TABLE

As I do not have any user triggers this of course does not make much sense here. Coming back to our initial t1 and t2 tables: as the foreign key currently is disabled we can insert data into the t2 table that would violate the constraint:

postgres=# select * from t1;
 a | b  |     c      
---+----+------------
 1 | aa | 2019-11-27
 2 | bb | 2019-11-27
(2 rows)

postgres=# select * from t2;
 a | b | c  
---+---+----
 1 | 1 | aa
 2 | 2 | aa
(2 rows)

postgres=# insert into t2 (a,b,c) values (3,3,'cc');
INSERT 0 1
postgres=# 

There clearly is no matching parent for this row in the t1 table but the insert succeeds, as the foreign key is disabled. Time to validate the constraint:

postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)
Disabled internal triggers:
    "RI_ConstraintTrigger_c_16460" AFTER INSERT ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_ins"()
    "RI_ConstraintTrigger_c_16461" AFTER UPDATE ON t2 FROM t1 NOT DEFERRABLE INITIALLY IMMEDIATE FOR EACH ROW EXECUTE FUNCTION "RI_FKey_check_upd"()

postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a)

postgres=# alter table t2 validate constraint t2_b_fkey;
ALTER TABLE
postgres=# 

Surprise, surprise, PostgreSQL does not complain about the invalid row. Why is that? If we ask the pg_constraint catalog table the constraint is recorded as validated:

postgres=# select convalidated from pg_constraint where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
 convalidated 
--------------
 t
(1 row)

It is even validated if we disable it once more:

postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# select convalidated from pg_constraint where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
 convalidated 
--------------
 t
(1 row)

That implies that PostgreSQL will not validate the constraint when we enable the internal triggers, and PostgreSQL will not re-validate all the data as long as the status is valid. What we really need to do to get the constraint validated is to invalidate it first:

postgres=# alter table t2 alter CONSTRAINT t2_b_fkey not valid;
ERROR:  ALTER CONSTRAINT statement constraints cannot be marked NOT VALID

Seems this is not the correct way of doing it. The correct way is to drop the foreign key and then re-create it in the invalid state:

postgres=# alter table t2 drop constraint t2_b_fkey;
ALTER TABLE
postgres=# delete from t2 where a in (3,4);
DELETE 2
postgres=# alter table t2 add constraint t2_b_fkey foreign key (b) references t1(a) not valid;
ALTER TABLE
postgres=# \d t2
                 Table "public.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           | not null | 
 b      | integer |           |          | 
 c      | text    |           |          | 
Indexes:
    "t2_pkey" PRIMARY KEY, btree (a)
Foreign-key constraints:
    "t2_b_fkey" FOREIGN KEY (b) REFERENCES t1(a) NOT VALID

Now we have the desired state and we can insert our data:

postgres=# insert into t2(a,b,c) values (3,3,'cc');
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(3) is not present in table "t1".

Surprise, again. Creating a “not valid” constraint only tells PostgreSQL not to scan the whole table to validate if all the rows are valid. For data inserted or updated the constraint is still checked, and this is why the insert fails.

What options do we have left? The obvious one is this (a sketch in SQL follows right after the list):

  • Drop all the foreign keys.
  • Load the data.
  • Re-create the foreign keys, but leave them invalid to avoid the costly scan of the tables. New data will be validated from now on.
  • Validate the constraints when there is less load on the system.
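
Translated into SQL for our little example, this flow would look roughly like the following. This is only a sketch: the data load in step 2 is a hypothetical \copy, replace it with whatever loads your data.

-- 1. drop the foreign key before the load
alter table t2 drop constraint t2_b_fkey;
-- 2. load the data (hypothetical example)
\copy t2 from 'data.csv' csv
-- 3. re-create the foreign key as NOT VALID: existing rows are not scanned,
--    but new and updated rows are checked from now on
alter table t2 add constraint t2_b_fkey foreign key (b) references t1(a) not valid;
-- 4. later, when there is less load on the system, validate the existing rows
alter table t2 validate constraint t2_b_fkey;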

Another possibility would be this:

postgres=# alter table t2 alter constraint t2_b_fkey deferrable;
ALTER TABLE
postgres=# begin;
BEGIN
postgres=# set constraints all deferred;
SET CONSTRAINTS
postgres=# insert into t2 (a,b,c) values (3,3,'cc');
INSERT 0 1
postgres=# insert into t2 (a,b,c) values (4,4,'dd');
INSERT 0 1
postgres=# insert into t1 (a,b,c) values (3,'cc',now());
INSERT 0 1
postgres=# insert into t1 (a,b,c) values (4,'dd',now());
INSERT 0 1
postgres=# commit;
COMMIT

The downside is that this only works until the next commit, so you have to do all your work in one transaction. The key point of this post is that the assumption that the following will validate your data is false:

postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# insert into t2 (a,b,c) values (5,5,'ee');
INSERT 0 1
postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# 

This will only validate new data but it does not guarantee that all the rows satisfy the constraint:

postgres=# insert into t2 (a,b,c) values (6,6,'ff');
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(6) is not present in table "t1".
postgres=# select * from t2 where b = 5;
 a | b | c  
---+---+----
 5 | 5 | ee
(1 row)

postgres=# select * from t1 where a = 5;
 a | b | c 
---+---+---
(0 rows)

Finally: There is another way of doing it, but this directly updates the pg_constraint catalog table and this is something you should _not_ do (never update internal tables directly!):

postgres=# delete from t2 where b = 5;
DELETE 1
postgres=# alter table t2 disable trigger all;
ALTER TABLE
postgres=# insert into t2 values (5,5,'ee');
INSERT 0 1
postgres=# alter table t2 enable trigger all;
ALTER TABLE
postgres=# update pg_constraint set convalidated = false where conname = 't2_b_fkey' and conrelid = 't2'::regclass;
UPDATE 1
postgres=# alter table t2 validate constraint t2_b_fkey;
ERROR:  insert or update on table "t2" violates foreign key constraint "t2_b_fkey"
DETAIL:  Key (b)=(5) is not present in table "t1".
postgres=# 

In this case the constraint will be fully validated as it is recorded as invalid in the catalog.

Conclusion: Do not rely on assumptions, always carefully test your procedures.


A schema and a user are not the same in PostgreSQL

Wed, 2019-11-27 11:56

When people with an Oracle background attend our PostgreSQL DBA Essentials training there is always a bit of confusion about schemas and users. In Oracle a schema and a user are a one-to-one relationship and there is no real distinction between a user and a schema. In PostgreSQL the situation is different: all the objects a user creates are created in a specific schema (or namespace). Other users may or may not have permissions to work with these objects or even to create new objects in a specific schema. Compared to Oracle there is one more layer.

The hierarchy in PostgreSQL is this:

|-------------------------------------------|---|
| PostgreSQL instance                       |   |
|-------------------------------------------| U |
|     Database 1      |     Database 2      | S |
|---------------------|---------------------| E |
| Schema 1 | Schema 2 | Schema 1 | Schema 2 | R |
|----------|----------|----------|----------| S |
| t1,t2,t3 | t1,t2,t3 | t1,t2,t3 | t1,t2,t3 |   |
-------------------------------------------------

What this little ASCII image is supposed to tell you: users (and roles) in PostgreSQL are global objects and are not defined in a database but at the instance level. Schemas are created by users in a specific database and contain database objects. What a lot of people get confused by is this:

postgres@centos8pg:/home/postgres/ [pgdev] psql -X postgres
psql (13devel)
Type "help" for help.

postgres=# create table t1 ( a int );
CREATE TABLE
postgres=# 

Nothing in this create table statement references a schema, but according to what I just said above all objects must go to a schema. Where did this table go then? Each PostgreSQL database comes with a public schema by default, and if you do not explicitly specify a schema the new object will go there. There are several ways of asking PostgreSQL for the schema of a given table, but probably the two most used ones are these (the first one queries a catalog view and the second one uses a psql shortcut):

postgres=# select schemaname from pg_tables where tablename = 't1';
 schemaname 
------------
 public
(1 row)

postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

Btw: the public schema is a special schema in PostgreSQL and you should either remove it or at least revoke the CREATE permission on it from PUBLIC. Check here for more information on that.
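
If you do not want to drop the public schema completely, revoking the CREATE privilege from PUBLIC is usually enough (a quick sketch):

postgres=# revoke create on schema public from public;
REVOKE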

So what happens when you drop the public schema and try to create a table afterwards?

postgres=# drop schema public cascade;
NOTICE:  drop cascades to table t1
DROP SCHEMA
postgres=# create table t1 ( a int );
ERROR:  no schema has been selected to create in
LINE 1: create table t1 ( a int );
                     ^
postgres=# 

As we do not have a single schema anymore:

postgres=# \dn
List of schemas
 Name | Owner 
------+-------
(0 rows)

… PostgreSQL has no idea where to put the table. At this point it should already be clear that a schema in PostgreSQL is not the same as a user. We are connected as the “postgres” user, but we do not have a schema to create our objects in. Let's create the first schema and right afterwards the same table as above:

postgres=# create schema my_schema;
CREATE SCHEMA
postgres=# create table t1 ( a int );
ERROR:  no schema has been selected to create in
LINE 1: create table t1 ( a int );
                     ^
postgres=# 

… again PostgreSQL is not able to create the table. The question is: why did it work when the public schema was there? We did not specify the public schema above, but it worked. This is where the search_path comes into play:

postgres=# show search_path;
   search_path   
-----------------
 "$user", public
(1 row)

postgres=# 

By default the search_path contains your current username and public. As none of these schemas exist right now, the create table statement fails. There are two options to fix that. Either use the fully qualified name:

postgres=# create table my_schema.t1 ( a int );
CREATE TABLE
postgres=# \d my_schema.t1
               Table "my_schema.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

… or adjust the search_path so that your preferred schema comes first:

postgres=# set search_path = 'my_schema',"$user",public;
SET
postgres=# show search_path ;
        search_path         
----------------------------
 my_schema, "$user", public
(1 row)

postgres=# create table t2 ( a int );
CREATE TABLE
postgres=# \d t2
               Table "my_schema.t2"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 

postgres=# 
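
If you do not want to set the search_path in every session, you can persist it for a role; the setting then applies to new sessions of that role (a small sketch):

postgres=# alter role postgres set search_path = my_schema,"$user",public;
ALTER ROLE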

That all might look a bit strange at the beginning, especially when you are used to Oracle, but it also provides great flexibility:

  • A user can create many different schemas, no need to create separate users
  • A user can grant permission to create objects in one of his schemas to someone else (see the sketch after this list)
  • You can logically divide your application
  • (no, there are no synonyms in PostgreSQL)
  • There are default privileges you can use
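
As an example for the second and the last point, the owner of my_schema could allow another user to work in it like this (app_user is just a demo user created for this sketch; adjust the privileges to what you really need):

postgres=# create user app_user with login password 'secret';
CREATE ROLE
postgres=# grant usage, create on schema my_schema to app_user;
GRANT
postgres=# alter default privileges in schema my_schema grant select on tables to app_user;
ALTER DEFAULT PRIVILEGES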


DOAG 2019: Cloud or Kubernetes on premise and CI/CD pipeline at the top of (my) interest

Fri, 2019-11-22 11:33

DOAG 2019 is just over now and if I check the subjects of the sessions I attended, I have the feeling that the Cloud, Kubernetes on-premise deployments and CI/CD pipelines were the top subjects of interest, or maybe that was just me!

I started with a DevOps session and then followed a lot of Kubernetes or container related sessions. The following sessions are those I directly think about when trying to summarize.

“The Pillars of Continuous Delivery” – Lykle Thijssen

A very interesting session about continuous delivery.

“How to create a solid foundation for Continuous Delivery, based on four pillars: Agile, Microservices, Automation and Cloud. Lykle explained how agile development can reduce time to production by short development cycles, how Microservices make the deployment process easier with less dependencies, how automation can improve test, build and deployment and how cloud can help with the necessary infrastructure. It all hangs together and without a solid foundation, there is always the risk of building a house of cards.”

“Containers Demystified” – Jan Karremans and Daniel Westermann

“If you are looking at implementing Dev/Ops, infrastructure as a code, if you want to adopt the Cloud or seeking to go with Microservices, you will find Containers on you path. After the opportunities that virtualization brought, Containers are the next best thing! Also (and perhaps specifically) looking at databases with containers, it brings specific challenges. Where containers infrastructures are built to fail and database rely on persistency, you have challenges.”

Watch My Services: Prometheus in Kubernetes – Thorsten Wussow

“The monitoring of microservices is a special challenge. In the lecture the problem of a monitoring of microservices will be discussed. Furthermore, various products are briefly considered, with which one can still carry out such a monitoring. Then a demo shows how to set up monitoring in a Kubernetes cluster with the help of Prometheus and Grafana and what to look for.”

A Kubernetes cluster was used in the Oracle Infrastructure Cloud to demonstrate the deployment and the configuration of Promotheus and Grafana.
With Thorsten, it always looks simple when following the demonstration. Now it is time to implement it.

Kubernetes im Vergleich: Google, AWS, Oracle, Microsoft – Michael Schulze & Borys Neselovskyi

“The presentation gives a brief overview of the architecture and construction of Kubernetes. In addition, useful application scenarios are presented in selected use cases. Later in the talk, we will compare existing Kubernetes cloud solutions from leading vendors (Google, AWS, Oracle and Microsoft). Here, a customer scenario is used as the basis for the comparison. The following criteria play a role: installation, maintenance, performance, monitoring and, of course, costs.”

A nice comparison between the different cloud solutions.

There were some other very interesting sessions but I will not list all of them now.

DOAG 2019 is over. See you next year there in Nuremberg


NFS Storage Configuration for Kubernetes

Tue, 2019-11-19 09:08

For one of our important customers, we are working on a major project to migrate critical applications to containers. From the implementation of the Kubernetes architecture to the deployment of applications and the administration of the platform, we are responsible for an important technological stack with new challenges for our team.

One of the challenges, both important and exciting, is the implementation of Kubernetes clusters on our own infrastructure (VMs) and their management. We have deployed a Kubernetes cluster in VMs, based on VMware.

As you know, one of the challenges of containerization is storage management. Do we manage stateless or stateful applications? For stateful applications, the way the data generated by the application is stored is very important.

Therefore, based on our infrastructure, we have 2 possibilities:

Here is a representative diagram of the 2 solutions:

Configuring NFS storage for Kubernetes

The Kubernetes infrastructure is composed of the following:

  • k8s-master
  • k8s-worker1
  • k8s-worker2

In addition, we have an NFS server to store our cluster data. In the next steps, we are going to expose the NFS share as a cluster object. We will create Kubernetes Persistent Volumes and Persistent Volume Claims for our application.
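
For reference, the NFS server is assumed to export the data directory used below to the Kubernetes nodes. A minimal /etc/exports entry could look like the following (the export options and the network range are assumptions, adjust them to your environment):

[ec2-user@ip-10-3-0-184 ~]$ cat /etc/exports
/home/ec2-user/data 10.3.0.0/16(rw,sync,no_root_squash)
[ec2-user@ip-10-3-0-184 ~]$ sudo exportfs -ra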

Persistent Volume Creation

Define the persistent volume at the cluster level as following:

[ec2-user@ip-10-3-1-217 ~]$ vi create-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-demo
  labels:
    app: nfs
    type: data
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  nfs:
    path: /home/ec2-user/data
    server: ec2-3-88-194-14.compute-1.amazonaws.com
  persistentVolumeReclaimPolicy: Retain

Create the persistent volume and see the results:

[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pv.yaml
persistentvolume/nfs-demo created
[ec2-user@ip-10-3-1-217 ~]$ kubectl get pv
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs-demo   10Gi       RWO            Retain           Available                                   7s

Once it’s created we can create a persistent volume claim. A PVC is dedicated to a specific namespace.
First, create the nfs-demo namespace, then the PVC.

[ec2-user@ip-10-3-1-217 ~]$ kubectl create ns nfs-demo
namespace/nfs-demo created
[ec2-user@ip-10-3-1-217 ~]$ vi create-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-demo
  namespace: nfs-demo
  labels:
   app: nfs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
     requests:
       storage: 10Gi
  selector:
    matchLabels:
      app: nfs
      type: data
[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pvc.yaml
persistentvolumeclaim/nfs-demo created

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pvc -n nfs-demo
NAME       STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfs-demo   Bound    nfs-demo   10Gi       RWO                           3m21s

We can see now that our persistent volume changes its status from “Available” to “Bound”.

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pv
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
nfs-demo   10Gi       RWO            Retain           Bound    nfs-demo/nfs-demo                           169m

Finally, let’s deploy now our workload which will consume the volume claim and the persistent volume. Whatever the workload API object you are using (Deployment, StatefulSet or DaemonSet) the Persistent Volume Claim is defined within the Pod specification, as follows:

[ec2-user@ip-10-3-1-217 ~]$ vi create-pod.yaml

kind: Pod
apiVersion: v1
metadata:
  name: nfs-pod
spec:
  containers:
    - name: nfs-demo
      image: alpine
      volumeMounts:
        - name: nfs-demo
          mountPath: /data/nfs
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
  - name: nfs-demo
    persistentVolumeClaim:
      claimName: nfs-demo
[ec2-user@ip-10-3-1-217 ~]$ kubectl create -f create-pod.yaml
pod/nfs-pod created

[ec2-user@ip-10-3-1-217 ~]$ kubectl get pods -o wide -n nfs-demo
NAME      READY   STATUS    RESTARTS   AGE   IP              NODE                         NOMINATED NODE   READINESS GATES
nfs-pod   1/1     Running   0          9s    192.168.37.68   ip-10-3-0-143.ec2.internal              

Let’s now create an empty file into the container volume mount path and see if it is has been created on the NFS server.

[ec2-user@ip-10-3-1-217 ~]$ kubectl -n nfs-demo exec nfs-pod touch /data/nfs/test-nfs.sh

We can now see on the NFS server that the file has been properly stored.

mehdi@MacBook-Pro: ssh -i "dbi.pem" ec2-user@ec2-3-88-194-14.compute-1.amazonaws.com
Last login: Tue Nov 19 13:35:18 2019 from 62.91.42.92

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
16 package(s) needed for security, out of 27 available
Run "sudo yum update" to apply all updates.

[ec2-user@ip-10-3-0-184 ~]$ ls -lrt data/
total 0
-rw-r--r-- 1 root root 0 Nov 19 13:42 test-nfs.sh


Fun with arrays in PostgreSQL

Mon, 2019-11-18 00:30

As you might already know, PostgreSQL comes with many, many data types. What you might not know is that you can create arrays over all of these data types quite easily. Is that important? Well, as always it depends on your requirements, but there are use cases where arrays are great. As always, let's do some simple tests.

The following creates a very simple table with one column, which is a one-dimensional array of integers:
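
postgres=# create table t1 ( a int[] );
CREATE TABLE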

postgres=# \d t1
                  Table "public.t1"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 

To insert data into that table you would either do it like this:

postgres=# insert into t1 (a) values ( '{1,2,3,4,5,6}' );
INSERT 0 1
postgres=# select * from t1;
       a       
---------------
 {1,2,3,4,5,6}
(1 row)

… or you can do it like this as well:

postgres=# insert into t1 (a) values ( ARRAY[1,2,3,4,5,6] );
INSERT 0 1
postgres=# select * from t1;
       a       
---------------
 {1,2,3,4,5,6}
 {1,2,3,4,5,6}
(2 rows)

Notice that I did not specify any size of the array. Although you can do that:

postgres=# create table t2 ( a int[6] );
CREATE TABLE

… the limit is not enforced by PostgreSQL:

postgres=# insert into t2 (a) values ( '{1,2,3,4,5,6,7,8}' );
INSERT 0 1
postgres=# select * from t2;
         a         
-------------------
 {1,2,3,4,5,6,7,8}
(1 row)

PostgreSQL does not limit you to one-dimensional arrays, you can well go ahead and create more dimensions:

postgres=# create table t3 ( a int[], b int[][], c int[][][] );
CREATE TABLE
postgres=# \d t3
                  Table "public.t3"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 
 b      | integer[] |           |          | 
 c      | integer[] |           |          | 

Although it does look like all of the columns are one-dimensional they are actually not:

postgres=# insert into t3 (a,b,c) values ( '{1,2,3}', '{{1,2,3},{1,2,3}}','{{{1,2,3},{1,2,3},{1,2,3}}}' );
INSERT 0 1
postgres=# select * from t3;
    a    |         b         |              c              
---------+-------------------+-----------------------------
 {1,2,3} | {{1,2,3},{1,2,3}} | {{{1,2,3},{1,2,3},{1,2,3}}}
(1 row)

In reality those array columns are not really one-dimensional, you can create as many dimensions as you like even when you think you created one dimension only:

postgres=# create table t4 ( a int[] );
CREATE TABLE
postgres=# insert into t4 (a) values ( '{1}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{1,2}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{1,2},{1,2}}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{{1,2},{1,2},{1,2}}}' );
INSERT 0 1
postgres=# insert into t4 (a) values ( '{{{{1,2},{1,2},{1,2},{1,2}}}}' );
INSERT 0 1
postgres=# select * from t4;
               a               
-------------------------------
 {1}
 {1,2}
 {{1,2},{1,2}}
 {{{1,2},{1,2},{1,2}}}
 {{{{1,2},{1,2},{1,2},{1,2}}}}
(5 rows)

Now that there are some rows: how can we query that? This matches the first two rows of the table:

postgres=# select ctid,* from t4 where a[1] = 1;
 ctid  |   a   
-------+-------
 (0,1) | {1}
 (0,2) | {1,2}
(2 rows)

This matches the second row only:

postgres=# select ctid,* from t4 where a = '{1,2}';
 ctid  |   a   
-------+-------
 (0,2) | {1,2}
(1 row)

This matches row three only:

postgres=# select ctid, * from t4 where a[1:2][1:3] = '{{1,2},{1,2}}';
 ctid  |       a       
-------+---------------
 (0,3) | {{1,2},{1,2}}
(1 row)

You can even index array data types by using a GIN index:

postgres=# create index i1 ON t4 using gin (a);
CREATE INDEX
postgres=# \d t4
                  Table "public.t4"
 Column |   Type    | Collation | Nullable | Default 
--------+-----------+-----------+----------+---------
 a      | integer[] |           |          | 
Indexes:
    "i1" gin (a)

This does not make much sense right now as we do not have sufficient data for PostgreSQL to consider the index, but as soon as we have more data the index will be helpful:

postgres=# insert into t4 select '{{1,2},{1,2}}' from generate_series(1,1000000);
INSERT 0 1000000
postgres=# explain select ctid,* from t4 where a = '{1,2}';
                            QUERY PLAN                            
------------------------------------------------------------------
 Bitmap Heap Scan on t4  (cost=28.00..32.01 rows=1 width=51)
   Recheck Cond: (a = '{1,2}'::integer[])
   ->  Bitmap Index Scan on i1  (cost=0.00..28.00 rows=1 width=0)
         Index Cond: (a = '{1,2}'::integer[])
(4 rows)
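
The GIN index is not only useful for equality: it also supports the array containment and overlap operators. A quick sketch with literals shows how they behave:

postgres=# select '{1,2,3}'::int[] @> '{2}'::int[] as contains, '{1,2}'::int[] && '{2,3}'::int[] as overlaps;
 contains | overlaps 
----------+----------
 t        | t
(1 row)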

In addition to that PostgreSQL comes with many support functions for working with arrays, e.g. to get the length of an array:

postgres=# select array_length(a,1) from t4 limit 2;
 array_length 
--------------
            1
            2
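
Two more functions you will probably use quite often are unnest, which turns an array into rows, and array_agg, which does the opposite (a quick sketch):

postgres=# select unnest('{1,2,3}'::int[]);
 unnest 
--------
      1
      2
      3
(3 rows)

postgres=# select array_agg(i) from generate_series(1,3) i;
 array_agg 
-----------
 {1,2,3}
(1 row)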

As I mentioned at the beginning of this post you can create arrays of all kinds of data types, not only integers:

postgres=# create table t5 ( a date[], b timestamp[], c text[], d point[], e boolean[] );
CREATE TABLE
postgres=# \d t5
                            Table "public.t5"
 Column |             Type              | Collation | Nullable | Default 
--------+-------------------------------+-----------+----------+---------
 a      | date[]                        |           |          | 
 b      | timestamp without time zone[] |           |          | 
 c      | text[]                        |           |          | 
 d      | point[]                       |           |          | 
 e      | boolean[]                     |           |          | 

Whatever you want. You can even create arrays over user defined types:

postgres=# create type type1 as ( a int, b text );
CREATE TYPE
postgres=# create table t6 ( a type1[] );
CREATE TABLE
postgres=# \d t6
                 Table "public.t6"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | type1[] |           |          | 

Quite powerful.


A day of conferences with the Swiss Oracle User Group

Sun, 2019-11-17 10:00
Introduction

I’m not that excited with all these events arround Oracle technologies (and beyond) but it’s always a good place to learn new things, and maybe the most important, discover new ways of thinking. And regarding this point, I was not disappointed.

Franck Pachot: serverless and distributed database

Franck talked about scaling out, which means avoiding monoliths. Most of the database servers today are this kind of monolith. He advises us to think microservices. It's not so easy regarding the database component, but it could surely simplify the management of different modules through different developer teams. Achieving scale-out also means getting rid of the old cluster technologies (think about RAC) and, instead, adopting a shared-nothing approach: no storage sharing, no network sharing, etc.
It also means the need for database replication and also scaling the writes, and that point is more complicated. Sharding is a key point for scaling out (put the associated data where the users reside).

I discovered the CAP theorem, a very interesting theory that shows us that there is actually no ultimate solution. You need to choose your priority: Consistency and Availability, Availability and Partition tolerance, or Consistency and Partition tolerance. Just remember to keep your database infrastructure adapted to your needs: a Google-like infrastructure is probably nice, but do you really need the same?

Kamran Aghayer: Transition from dba to data engineer

Times are changing. I have known that for several years, but now it is obvious: as a traditional DBA, I will soon be deprecated. Old-school DBA jobs will be replaced by a lot of new jobs: data architect, data engineer, data analyst, data scientist, machine learning engineer, AI engineer, …

Kamran focused on the Hadoop ecosystem and Spark, especially when he needed to archive data from Exadata to Hadoop (and explained how Hadoop manages data through the HDFS filesystem and datanodes, sort of like ASM). He used a dedicated connector, a sort of wrapper using external tables. Actually this is also what is inside the Big Data Appliance from Oracle. This task was out of the scope of a traditional DBA, as a good knowledge of the data was needed. So, the traditional DBA is dead.

Stefan Oehrli – PDB isolation and security

Since Oracle announced the availability of 3 free PDBs with each container database, the interest in Multitenant has increased.

We had an overview of the top 10 security risks, all about privileges: privilege abuse, unauthorized privilege elevation, platform vulnerabilities, SQL injection, etc. If you're already in the cloud with PaaS or DBaaS, the risks are the same.

We had a presentation of several measures for risk mitigation:
– path_prefix: it's some kind of chroot for the PDB
– PDB_os_credential (still has bugs, but…): concerns credentials and dbms_scheduler
– lockdown profiles: a tool for restricting database features like queuing, partitioning, Java OS access or altering the database. Restrictions work with inclusion or exclusion.

Paolo Kreth and Thomas Bauman: The role of the DBA in the era of Cloud, Multicloud and Autonomous Database

I had already heard today that the classic DBA will soon be dead. And now the second bullet. The fact is that Oracle has worked hard to improve autonomous features during the last 20 years, and as it was presented, you realize that this is clearly true. Who cares about extent management now?

But there is still hope. The DBA of tomorrow starts today. As the DBA role actually sits between the infrastructure team and the data scientists, there is a way to architect your career. Keep a foot in the technical stuff, but become a champion in data analysis and machine learning.

Or focus on development with opensource and cloud. The DBA job can shift, don’t miss this opportunity.

Nikitas Xenakis – MAA with 19c and GoldenGate 19c: a real-world case study

Hey! Finally, the DBA is not dead yet! Some projects still need technical skills and complex architecture. The presented project was driven by downtime costs, and for some kinds of businesses, a serious downtime can kill the company. The customer concerned by this project cannot afford more than 1 hour of global downtime.

We had an introduction of MAA (standing for Maximum Availability Architecture – see Oracle documentation for that).

You first need to estimate:
– the RPO: how much data you can afford to lose
– the RTO: how quickly you'll be up again
– the performance you expect after the downtime: because it matters

The presented infrastructure was composed of RHEL, RAC with Multitenant (1 PDB only), Active Data Guard and GoldenGate. The middleware was not from Oracle but was configured to work with Transparent Application Failover.

For sure, you still need several old-school DBAs to set up and manage this kind of infrastructure.

Luiza Nowak: Error when presenting your data

You can refer to the blog from Elisa USAI for more information.

For me, it was very surprising to discover how a presentation can be boring, confusing or miss the point just because of inappropriate slides. Be precise, be captivating, use graphics instead of sentences, and make good use of those graphics if you want your presentation to have the expected impact.

Julian Frey: Database cloning in a multitenant environment

Back to pure DBA stuff. A quick reminder of why we need to clone and what we need to clone (data, metadata, partial data, refreshed data only, anonymised data, etc). And now, always considering GDPR compliance!

Cloning before 12c was mainly done with these well known tools: RMAN duplicate, Data Pump, GoldenGate, database links, storage cloning, and the embedded clone.pl script (I hadn't heard about this one before).

Starting from 12c, and only if you’re using multitenant, new convenient tools are available for cloning: PDB snapshot copy, snapshot carousel, refreshable copy, …

I discovered that you can duplicate a PDB without actually putting the source PDB in read only mode: you just need to put your source PDB in begin backup mode, copy the files, generate the metadata file and create the database with resetlogs. Nice feature.

You have to know that cloning a PDB is native with Multitenant, a PDB always being a clone of something (at the very least, an empty PDB is created from PDB$SEED).

Note that snapshot copy of a PDB is limited to some kinds of filesystems, the best known being NFS and ACFS. If you decide to go for Multitenant without actually having the option, don't forget to limit the maximum number of PDBs in your CDB settings; it's actually a parameter: MAX_PDBS. Another interesting feature is the possibility to create a PDB from a source PDB without the data (but the tablespaces and tables are created).

Finally, and against all odds, Data Pump is still a great tool for most cases. You'd better still consider this tool too.

Conclusion

This was a great event from great organizers, and while pure Oracle DBA is probably not a job that makes younger people dream, jobs dealing with data are not going to disappear in the near future.


PostgreSQL 12 : Setting Up Streaming Replication

Sat, 2019-11-16 11:29

PostgreSQL 12 was released a few months ago. When actually setting up a replication, there is no longer a recovery.conf file in the PGDATA. Indeed, all parameters of the former recovery.conf now go into the postgresql.conf file, and in the cluster data directory of the standby server there should be a file named standby.signal to trigger the standby mode.
In this blog I am just building a streaming replication between 2 servers to show these changes. The configuration we are using is:
Primary server dbi-pg-essentials : 192.168.56.101
Standby server dbi-pg-essentials-2 : 192.168.56.102

The primary server is up and running on dbi-pg-essentials server.

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12] pg12

********* dbi services Ltd. *********
                  STATUS  : OPEN
         ARCHIVE_COMMAND  : test ! -f /u99/pgdata/12/archived_wal/%f && cp %p /u99/pgdata/12/archived_wal/%f
            ARCHIVE_MODE  : on
    EFFECTIVE_CACHE_SIZE  : 4096MB
                   FSYNC  : on
          SHARED_BUFFERS  : 128MB
      SYNCHRONOUS_COMMIT  : on
                WORK_MEM  : 4MB
              IS_STANDBY  : false
*************************************

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]

Step 1: Prepare the user for the replication on the primary server
For streaming replication, we need a user to read the WAL stream. We could do it with a superuser but it is not required. We will create a user with the REPLICATION and LOGIN privileges. Contrary to the SUPERUSER privilege, the REPLICATION privilege gives very high permissions but does not allow modifying any data.
Here we will create a user named repliuser.

postgres=# create user repliuser with password 'postgres'  replication ;
CREATE ROLE
postgres=#

Step 2: Prepare the authentication on the primary server
The user used for the replication should be allowed to connect for replication. We then need to adjust the pg_hba.conf file for the two servers.

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12] grep repliuser pg_hba.conf
host    replication     repliuser        192.168.56.101/32        md5
host    replication     repliuser        192.168.56.102/32        md5
postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]

Step 3: Create a replication slot on the primary server
Replication slots provide an automated way to ensure that the master does not remove WAL segments until they have been received by all standbys, and that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected.

psql (12.1 dbi services build)
Type "help" for help.

postgres=# SELECT * FROM pg_create_physical_replication_slot('pg_slot_1');
 slot_name | lsn
-----------+-----
 pg_slot_1 |
(1 row)

postgres=#
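
You can check that the slot was created; it will show as active once the standby is streaming from it:

postgres=# select slot_name, slot_type, active from pg_replication_slots;
 slot_name | slot_type | active 
-----------+-----------+--------
 pg_slot_1 | physical  | f
(1 row)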

Step 4: Take a backup of the primary database and restore it on the standby
From the standby server, launch the following command:

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pg_basebackup -h 192.168.56.101 -D /u02/pgdata/12/PG1 --wal-method=fetch -U repliuser

Step 5: Set the primary connection info for streaming on the standby side
The host name and port number of the primary, the connection user name and the password are specified in primary_conninfo. Here there is a little change, as there is no longer a recovery.conf file: primary_conninfo should now be specified in postgresql.conf.

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] grep primary postgresql.conf
primary_conninfo = 'host=192.168.56.101 port=5432 user=repliuser password=postgres'
primary_slot_name = 'pg_slot_1'                 # replication slot on sending server

Step 6: Create the standby.signal file on the standby server
In the cluster data directory of the standby, create a file standby.signal

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pwd
/u02/pgdata/12/PG1
postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] touch standby.signal

Step 7: Start the standby cluster

postgres@dbi-pg-essentials-2:/u02/pgdata/12/PG1/ [PG12] pg_ctl start

If everything is fine, you should find the following in the log file:

2019-11-16 17:41:21.552 CET [1590] LOG:  database system is ready to accept read only connections
2019-11-16 17:41:21.612 CET [1596] LOG:  started streaming WAL from primary at 0/5000000 on timeline 1
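
On the primary, pg_stat_replication should now show the WAL sender serving our standby (a quick check; the exact values will of course differ in your environment):

postgres=# select client_addr, state, sync_state from pg_stat_replication;
  client_addr   |   state   | sync_state 
----------------+-----------+------------
 192.168.56.102 | streaming | async
(1 row)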

As confirmed by the dbi DMK tool, the master is now streaming to the standby server:

********* dbi services Ltd. *********
                  STATUS  : OPEN
         ARCHIVE_COMMAND  : test ! -f /u99/pgdata/12/archived_wal/%f && cp %p /u99/pgdata/12/archived_wal/%f
            ARCHIVE_MODE  : on
    EFFECTIVE_CACHE_SIZE  : 4096MB
                   FSYNC  : on
          SHARED_BUFFERS  : 128MB
      SYNCHRONOUS_COMMIT  : on
                WORK_MEM  : 4MB
              IS_STANDBY  : false
               IS_MASTER  : YES, streaming to 192.168.56.102/32
*************************************

postgres@dbi-pg-essentials:/u02/pgdata/12/PG1/ [PG12]

The post PostgreSQL 12 : Setting Up Streaming Replication appeared first on Blog dbi services.

Elapsed time of Oracle Parallel Executions are not shown correctly in AWR

Fri, 2019-11-15 10:08

As the elapsed time (the time it takes for a task from start to end, often called wall-clock time) per execution of parallel queries is not shown correctly in AWR reports, I set up a testcase to find a way to get an elapsed time closer to reality.

REMARK: To use AWR (Automatic Workload Repository) and ASH (Active Session History) as described in this Blog you need to have the Oracle Diagnostics Pack licensed.

I created a table t5 with 213K blocks:

SQL> select blocks from tabs where table_name='T5';
 
    BLOCKS
----------
    213064

In addition I enabled Linux-IO-throttling with 300 IOs/sec through a cgroup on my device sdb to ensure the parallel-statements take a couple of seconds to run:

[root@19c ~]# CONFIG_BLK_CGROUP=y
[root@19c ~]# CONFIG_BLK_DEV_THROTTLING=y
[root@19c ~]# echo "8:16 300" > /sys/fs/cgroup/blkio/blkio.throttle.read_iops_device

After that I ran my test:

SQL> select sysdate from dual;
 
SYSDATE
-------------------
14.11.2019 14:03:51
 
SQL> exec dbms_workload_repository.create_snapshot;
 
PL/SQL procedure successfully completed.
 
SQL> set timing on
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.63
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.62
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.84
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.73
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.63
SQL> select /*+ parallel(t5 2) full(t5) */ count(*) from t5;
 
  COUNT(*)
----------
  50403840
 
Elapsed: 00:00:05.74
SQL> exec dbms_workload_repository.create_snapshot;
 
PL/SQL procedure successfully completed.

Note the elapsed time of about 5.7 seconds per execution.

The AWR-report shows the following in the “SQL ordered by Elapsed time”-section:

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
            67.3              6         11.22   94.5   37.4   61.3 04r3647p2g7qu
Module: SQL*Plus
select /*+ parallel(t5 2) full(t5) */ count(*) from t5

I.e. 11.22 seconds on average per execution. However, as we can see above, the average execution time is around 5.7 seconds. The reason for the wrong elapsed time per execution is that the elapsed time of the parallel slaves is added to the total elapsed time, even though the processes worked in parallel. Thanks to the column SQL_EXEC_ID (very useful) we can get the sum of the elapsed times per execution from ASH:

SQL> break on report
SQL> compute avg of secs_db_time on report
SQL> select sql_exec_id, qc_session_id, qc_session_serial#, count(*) secs_db_time from v$active_session_history
  2  where sql_id='04r3647p2g7qu' and sample_time>to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss')
  3  group by sql_exec_id, qc_session_id, qc_session_serial#
  4  order by 1;
 
SQL_EXEC_ID QC_SESSION_ID QC_SESSION_SERIAL# SECS_DB_TIME
----------- ------------- ------------------ ------------
   16777216	      237                  16626           12
   16777217	      237                  16626           12
   16777218	      237                  16626           10
   16777219	      237                  16626           12
   16777220	      237                  16626           10
   16777221	      237                  16626           10
                                             ------------
avg                                                    11
 
6 rows selected.

I.e. the 11 secs correspond to the 11.22 secs in the AWR-report.

How do we get the real elapsed time for the parallel queries? If the queries take a couple of seconds we can get the approximate time from ASH as well by subtracting the sample_time at the beginning from the sample_time at the end of each execution (SQL_EXEC_ID):

SQL> select sql_exec_id, extract (second from (max(sample_time)-min(sample_time))) secs_elapsed 
  2  from v$active_session_history
  3  where sql_id='04r3647p2g7qu'
  4  and sample_time>to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss')
  5  group by sql_exec_id
  6  order by 1;
 
SQL_EXEC_ID SECS_ELAPSED
----------- ------------
   16777216         5.12
   16777217        5.104
   16777218         4.16
   16777219        5.118
   16777220        4.104
   16777221        4.171
 
6 rows selected.

I.e. those numbers reflect the real execution time much better.

REMARK: If the queries take minutes (or hours) to run then you have to extract the minutes (and hours) as well of course. See also the example I have at the end of the Blog.
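
A sketch of such a query (the same ASH data as above, just adding the hour and minute parts and converting everything to seconds):

select sql_exec_id,
       extract(hour   from (max(sample_time)-min(sample_time)))*3600
     + extract(minute from (max(sample_time)-min(sample_time)))*60
     + extract(second from (max(sample_time)-min(sample_time))) secs_elapsed
from v$active_session_history
where sql_id='04r3647p2g7qu'
and sample_time>to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss')
group by sql_exec_id
order by 1;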

The info in V$SQL is also not very helpful:

SQL> set lines 200 pages 999
SQL> select child_number, plan_hash_value, elapsed_time/1000000 elapsed_secs, 
  2  executions, px_servers_executions, last_active_time 
  3  from v$sql where sql_id='04r3647p2g7qu';
 
CHILD_NUMBER PLAN_HASH_VALUE ELAPSED_SECS EXECUTIONS PX_SERVERS_EXECUTIONS LAST_ACTIVE_TIME
------------ --------------- ------------ ---------- --------------------- -------------------
           0      2747857355    67.346941          6                    12 14.11.2019 14:05:17

I.e. for the QC we have the column executions > 0 and for the parallel slaves we have px_servers_executions > 0. You may actually get different child cursors for the Query Coordinator and the slaves.

So in theory we should be able to do something like:

SQL> select child_number, (sum(elapsed_time)/sum(executions))/1000000 elapsed_time_per_exec_secs 
  2  from v$sql where sql_id='04r3647p2g7qu' group by child_number;
 
CHILD_NUMBER ELAPSED_TIME_PER_EXEC_SECS
------------ --------------------------
           0                 11.2244902

Here we do see the number from the AWR again.

So in future be careful when checking the elapsed time per execution of statements which ran with parallel slaves: the number will be too high in AWR and V$SQL, and further analysis is necessary to get the real elapsed time per execution.

REMARK: As the numbers in AWR do come from e.g. dba_hist_sqlstat, the following query provides “wrong” output for parallel executions as well:

SQL> column begin_interval_time format a32
SQL> column end_interval_time format a32
SQL> select begin_interval_time, end_interval_time, ELAPSED_TIME_DELTA/1000000 elapsed_time_secs, 
  2  (ELAPSED_TIME_DELTA/EXECUTIONS_DELTA)/1000000 elapsed_per_exec_secs
  3  from dba_hist_snapshot snap, dba_hist_sqlstat sql 
  4  where snap.snap_id=sql.snap_id and sql_id='04r3647p2g7qu' 
  5  and snap.BEGIN_INTERVAL_TIME > to_date('14.11.2019 14:03:51','dd.mm.yyyy hh24:mi:ss');
 
BEGIN_INTERVAL_TIME              END_INTERVAL_TIME                ELAPSED_TIME_SECS ELAPSED_PER_EXEC_SECS
-------------------------------- -------------------------------- ----------------- ---------------------
14-NOV-19 02.04.00.176 PM        14-NOV-19 02.05.25.327 PM                67.346941            11.2244902

As another example, I ran a query from Jonathan Lewis, taken from
https://jonathanlewis.wordpress.com/category/oracle/parallel-execution:

SQL> @jonathan
 
19348 rows selected.
 
Elapsed: 00:06:42.11

I.e. 402.11 seconds

AWR shows 500.79 seconds:

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
           500.8              1        500.79   97.9   59.6   38.6 44v4ws3nzbnsd
Module: SQL*Plus
select /*+ parallel(t1 2) parallel(t2 2)
 leading(t1 t2) use_hash(t2) swa
p_join_inputs(t2) pq_distribute(t2 hash hash) ca
rdinality(t1,50000) */ t1.owner, t1.name, t1.typ

Let’s check ASH with the query I used above (this time including minutes):

select sql_exec_id, extract (minute from (max(sample_time)-min(sample_time))) minutes_elapsed,
extract (second from (max(sample_time)-min(sample_time))) secs_elapsed 
from v$active_session_history
where sql_id='44v4ws3nzbnsd'
group by sql_exec_id
order by 1;
 
SQL_EXEC_ID MINUTES_ELAPSED SECS_ELAPSED
----------- --------------- ------------
   16777216	              6       40.717

I.e. 06:40.72 which is close to the real elapsed time of 06:42.11

The post Elapsed time of Oracle Parallel Executions are not shown correctly in AWR appeared first on Blog dbi services.

A simple Terraform script to create an AWS EC2 playground

Mon, 2019-11-11 02:39

When I started to write the blog about AWS SSM I quickly realized that I needed a way to bring up and destroy my AWS EC2 playground in an easy and reproducible way. There are several options for this: I could have used the AWS command line interface or AWS CloudFormation. Both work pretty well and would have been more than enough for my simple requirements. In the end I decided to use Terraform for one simple reason: with Terraform you can not only provision on AWS but also on Azure, GCP and many others. So, instead of using a tool which is limited to AWS, using a vendor-independent tool makes much more sense.

For the AWS SSM blog I had several requirements I wanted to address:

  • The playground should run in my own VPC so I will not affect any other people doing tests at the same time
  • Because I had Windows and Linux EC2 instances two security groups should get created, one allowing SSH and one allowing RDP into the EC2 machines
  • Both security groups should allow outbound connections to the internet by using an internet gateway
  • Finally two Windows, two Ubuntu, two Amazon Linux, two Red Hat, two SLES and two CentOS instances should get provisioned

Using Terraform all of this is actually quite simple. The first information you'll need to tell Terraform is the provider you want to use. In my case this is AWS and I am using my “test” AWS profile which is configured for my AWS command line interface; the AWS region I want to use is Frankfurt:

// set the provider to AWS and the AWS region to eu-central-1
provider "aws" {
  profile    = "test"
  region     = "eu-central-1"
}

Because I want to limit direct access to the EC2 instances to my own IP address for security reasons, I am defining a local variable which I change each time I am setting this up from a different network. Input variables could also be used for that (see the sketch after the code block), but in my case a local variable is just fine:

locals {
  my_ip        = ["37.201.6.8/32"]
}
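
Just as a sketch (not part of the original script), the input variable alternative would look roughly like this; it could then be overridden with -var on the command line and would be referenced as var.my_ip instead of local.my_ip:

// alternative: an input variable instead of a local value
variable "my_ip" {
  description = "CIDR block allowed to reach the EC2 instances"
  default     = ["37.201.6.8/32"]
}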

The first component that actually gets provisioned on AWS is the VPC, using the aws_vpc resource:

// create the virtual private network
resource "aws_vpc" "dwe-vpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support = true

  tags = {
    Name = "dwe-vpc"
  }
}

This defines the IP range I'll be using, enables the DNS parameters and assigns a tag. As I want my EC2 instances to be able to connect to the internet, an internet gateway gets provisioned in the next step, which is attached to the VPC that was created in the step before:

// create the internet gateway
resource "aws_internet_gateway" "dwe-igw" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  tags = {
    Name = "dwe-igw"
  }
}

Next we need a subnet, again attached to the VPC:

// create a dedicated subnet
resource "aws_subnet" "dwe-subnet" {
  vpc_id            = "${aws_vpc.dwe-vpc.id}"
  cidr_block        = "10.0.1.0/24"
  availability_zone = "eu-central-1a"

  tags = {
    Name = "dwe-subnet"
  }
}

Routing tables are an important concept in AWS and define how traffic is routed in the VPC. The routing table below enables traffic to the internet through the internet gateway:

// create routing table which points to the internet gateway
resource "aws_route_table" "dwe-route" {
  vpc_id = "${aws_vpc.dwe-vpc.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.dwe-igw.id}"
  }

  tags = {
    Name = "dwe-igw"
  }
}

Once the routing table is defined it needs to be attached to the subnet:

// associate the routing table with the subnet
resource "aws_route_table_association" "subnet-association" {
  subnet_id      = "${aws_subnet.dwe-subnet.id}"
  route_table_id = "${aws_route_table.dwe-route.id}"
}

For ssh and rdp access to the EC2 instances two security groups get provisioned:

// create a security group for ssh access to the linux systems
resource "aws_security_group" "dwe-sg-ssh" {
  name        = "dwe-sg-ssh"
  description = "Allow SSH inbound traffic"
  vpc_id      = "${aws_vpc.dwe-vpc.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "dwe-sg-ssh"
  }
}

// create a security group for rdp access to the windows systems
resource "aws_security_group" "dwe-sg-rdp" {
name        = "dwe-sg-rdp"
vpc_id      = "${aws_vpc.dwe-vpc.id}"
description = "Allow RDP inbound traffic"

  ingress {
    from_port   = 3389
    to_port     = 3389
    protocol    = "tcp"
    cidr_blocks = local.my_ip
  }

  // allow access to the internet
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }


  tags = {
    Name = "dwe-sg-rdp"
  }
}

Both define an inbound rule (ssh and rdp respectively) and an outbound rule to be able to connect to the internet. Now, as all the basics are there, the EC2 instances themselves get provisioned based on the building blocks from above:

// create two Ubuntu instances
resource "aws_instance" "i-ubuntu-linux-prod" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = {
    Name = "i-ubuntu-linux-prod"
  }  
}

resource "aws_instance" "i-ubuntu-linux-test" {
  ami                         = "ami-0cc0a36f626a4fdf5"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-ubuntu-linux-test"
  } 
}

// create two Amazon linux instances
resource "aws_instance" "i-amazon-linux-prod" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-prod"
  } 
}

resource "aws_instance" "i-amazon-linux-test" {
  ami                         = "ami-0f3a43fbf2d3899f7"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-amazon-linux-test"
  } 
}

// create two Red Hat linux instances
resource "aws_instance" "i-redhat-linux-prod" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device {
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-redhat-linux-prod"
  } 
}

resource "aws_instance" "i-redhat-linux-test" {
  ami                         = "ami-0badcc5b522737046"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-redhat-linux-test"
  }
}

// create two SUSE linux instances
resource "aws_instance" "i-suse-linux-prod" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-suse-linux-prod"
  }
}

resource "aws_instance" "i-suse-linux-test" {
  ami                         = "ami-03b86a97a8f02d44e"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-suse-linux-test"
  }
}

// create two CentOS linux instances
resource "aws_instance" "i-centos-linux-prod" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-centos-linux-prod"
  }
}

resource "aws_instance" "i-centos-linux-test" {
  ami                         = "ami-04cf43aca3e6f3de3"
  instance_type               = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-ssh.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"
  
  root_block_device {     
    volume_size           = "10"
    volume_type           = "standard"
    delete_on_termination = "true"
  }
  
  tags = { 
    Name = "i-centos-linux-test"
  }
}

// create two Windows instances
resource "aws_instance" "i-windows-prod" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = { 
    Name = "i-windows-prod"
  }
}

resource "aws_instance" "i-windows-test" {
  ami           = "ami-034937fd7f621ba85"
  instance_type = "t2.micro"
  key_name                    = "dwe-key-pair"
  vpc_security_group_ids      = ["${aws_security_group.dwe-sg-rdp.id}"]
  subnet_id                   = "${aws_subnet.dwe-subnet.id}"
  associate_public_ip_address = "true"

  root_block_device { 
    volume_size           = "30"
    volume_type           = "standard"
    delete_on_termination = "true"
  }

  tags = {
    Name = "i-windows-test"
  }
}
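
One optional addition I like to make (a sketch, not part of the original script): output values for the public IP addresses, so that terraform output prints them once the apply has finished. One example for the Ubuntu test instance:

// print the public IP of the Ubuntu test instance after apply
output "ubuntu_test_public_ip" {
  value = "${aws_instance.i-ubuntu-linux-test.public_ip}"
}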

And that’s it. To check what actually will be done by Terraform there is the “plan” command. As the output is quite long I’ll skip most of it and just present the last few lines:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform plan

...

Plan: 19 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

Once you are happy with that you can “apply” the execution plan; after you confirm with “yes”, everything will get provisioned:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform apply

Plan: 19 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Sit back, relax and one or two minutes later your AWS playground is ready:

aws_vpc.dwe-vpc: Creating...
aws_vpc.dwe-vpc: Creation complete after 2s [id=vpc-026cdc481d5365074]
aws_internet_gateway.dwe-igw: Creating...
aws_subnet.dwe-subnet: Creating...
aws_security_group.dwe-sg-ssh: Creating...
aws_security_group.dwe-sg-rdp: Creating...
aws_subnet.dwe-subnet: Creation complete after 0s [id=subnet-028e27fef8df3b963]
aws_internet_gateway.dwe-igw: Creation complete after 0s [id=igw-0656108a04d5ea0a5]
aws_route_table.dwe-route: Creating...
aws_security_group.dwe-sg-rdp: Creation complete after 1s [id=sg-0764508e3a5234393]
aws_route_table.dwe-route: Creation complete after 1s [id=rtb-07691bc54b40af0ae]
aws_route_table_association.subnet-association: Creating...
aws_instance.i-windows-prod: Creating...
aws_instance.i-windows-test: Creating...
aws_security_group.dwe-sg-ssh: Creation complete after 1s [id=sg-053995952f558a4ff]
aws_instance.i-centos-linux-test: Creating...
aws_instance.i-amazon-linux-prod: Creating...
aws_instance.i-amazon-linux-test: Creating...
aws_instance.i-redhat-linux-prod: Creating...
aws_instance.i-centos-linux-prod: Creating...
aws_instance.i-redhat-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creating...
aws_route_table_association.subnet-association: Creation complete after 0s [id=rtbassoc-07c7d4282033c4a71]
aws_instance.i-ubuntu-linux-prod: Creating...
aws_instance.i-windows-prod: Still creating... [10s elapsed]
aws_instance.i-windows-test: Still creating... [10s elapsed]
aws_instance.i-centos-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [10s elapsed]
aws_instance.i-redhat-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-test: Still creating... [10s elapsed]
aws_instance.i-ubuntu-linux-prod: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-test: Creation complete after 13s [id=i-02706717c3440723a]
aws_instance.i-suse-linux-test: Creating...
aws_instance.i-ubuntu-linux-test: Creation complete after 16s [id=i-0d2999aa319a90a3d]
aws_instance.i-centos-linux-test: Creation complete after 16s [id=i-03923fcf9d5881421]
aws_instance.i-suse-linux-prod: Creating...
aws_instance.i-ubuntu-linux-prod: Creation complete after 16s [id=i-00967725bc758f3ef]
aws_instance.i-redhat-linux-test: Creation complete after 16s [id=i-02a705327fb0acb61]
aws_instance.i-windows-prod: Creation complete after 16s [id=i-09c3fcc90491ef2cf]
aws_instance.i-redhat-linux-prod: Creation complete after 16s [id=i-0161726dfe1ed890b]
aws_instance.i-windows-test: Creation complete after 16s [id=i-02f567d11d32444fd]
aws_instance.i-amazon-linux-prod: Still creating... [20s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [10s elapsed]
aws_instance.i-amazon-linux-prod: Creation complete after 23s [id=i-0b799879e77ce8b33]
aws_instance.i-suse-linux-prod: Still creating... [10s elapsed]
aws_instance.i-centos-linux-prod: Still creating... [30s elapsed]
aws_instance.i-centos-linux-prod: Creation complete after 32s [id=i-0482d958849f86483]
aws_instance.i-suse-linux-test: Still creating... [20s elapsed]
aws_instance.i-suse-linux-prod: Still creating... [20s elapsed]
aws_instance.i-suse-linux-test: Still creating... [30s elapsed]
aws_instance.i-suse-linux-test: Creation complete after 32s [id=i-0b35d559853f9f0d6]
aws_instance.i-suse-linux-prod: Still creating... [30s elapsed]
aws_instance.i-suse-linux-prod: Creation complete after 33s [id=i-062df970b894a23da]

Apply complete! Resources: 19 added, 0 changed, 0 destroyed.

Once you’re done with your tests, simply destroy everything by using the “destroy” command:

dwe@dwe:~/Documents/aws/ssm_demo$ terraform destroy
...
Plan: 0 to add, 0 to change, 19 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes
...
aws_security_group.dwe-sg-rdp: Destroying... [id=sg-0764508e3a5234393]
aws_security_group.dwe-sg-rdp: Destruction complete after 1s
aws_security_group.dwe-sg-ssh: Destruction complete after 1s
aws_subnet.dwe-subnet: Destruction complete after 1s
aws_vpc.dwe-vpc: Destroying... [id=vpc-026cdc481d5365074]
aws_vpc.dwe-vpc: Destruction complete after 0s

Destroy complete! Resources: 19 destroyed.

Quite easy, always reproducible and fast.

The post A simple Terraform script to create an AWS EC2 playground appeared first on Blog dbi services.

odacli create-database error DCS-10802: Insufficient disk space on file system: database

Mon, 2019-11-11 02:27
Introduction

I was reimaging an X6-2M ODA after various patching troubles, and everything was fine. After several databases had been created, the next ones could no longer be created.

DCS-10802: Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

I’ve spent some time on it, and finally found the cause of the problem. And the solution.

Context

After successfully reimaging an X6-2M ODA with 18.5, and applying the patch for the firmware, ILOM and disks, I was creating all the databases with odacli using the following commands:


odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json
odacli create-database ...

Each database is created with the smallest shape odb1s, as I later fine tune each instance according to my needs.

After the 8th or 9th database created, the next ones ended with a failure:

odacli describe-job -i "2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8"

Job details
----------------------------------------------------------------
ID: 2ad9f7e8-331d-4a82-bb7b-c88ddad1cdf8
Description: Database service creation with db name: HPMVRN
Status: Failure
Created: November 8, 2019 1:33:23 PM CET
Message: DCS-10802:Insufficient disk space on file system: database. Expected free space (MB): {1}, available space (MB): {2}

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Service creation November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:58 PM CET Failure
Setting up ssh equivalance November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:23 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:33:23 PM CET November 8, 2019 1:33:42 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:33:42 PM CET November 8, 2019 1:33:57 PM CET Success
Database Service creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure
Database Creation November 8, 2019 1:33:57 PM CET November 8, 2019 1:33:58 PM CET Failure

Analysis

The error seems obvious: insufficient disk space. Let's check the disk space:


df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
30G 5.6G 23G 20% /
tmpfs 126G 1.1G 125G 1% /dev/shm
/dev/sda1 477M 41M 411M 9% /boot
/dev/mapper/VolGroupSys-LogVolOpt
109G 85G 19G 82% /opt
/dev/mapper/VolGroupSys-LogVolU01
148G 31G 110G 22% /u01
/dev/asm/datdbtest-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/DBTEST
/dev/asm/datgotst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GOTST
/dev/asm/datgeval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GEVAL
/dev/asm/commonstore-8
5.0G 391M 4.7G 8% /opt/oracle/dcs/commonstore
/dev/asm/datsmval-8
100G 44G 57G 44% /u02/app/oracle/oradata/SMVAL
/dev/asm/datvival-8
100G 34G 67G 34% /u02/app/oracle/oradata/VIVAL
/dev/asm/datvjval-8
100G 56G 45G 56% /u02/app/oracle/oradata/VJVAL
/dev/asm/dump-8 200G 132G 69G 66% /dpdumplocal
/dev/asm/dataoval-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/AOVAL
/dev/asm/datgrtst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GRTST
/dev/asm/datgival-8
100G 7.8G 93G 8% /u02/app/oracle/oradata/GIVAL
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle
/dev/asm/datgetst-8 100G 2.2G 98G 3% /u02/app/oracle/oradata/GETST
/dev/asm/datgftst-8
100G 30G 71G 30% /u02/app/oracle/oradata/GFTST
/dev/asm/datgctst-8
100G 2.2G 98G 3% /u02/app/oracle/oradata/GCTST
/dev/asm/dathpmvrn-8 100G 448M 100G 1% /u02/app/oracle/oradata/HPMVRN

No filesystem is full. And create-database managed to create the acfs volume for data successfully. Let’s try to put something in it:


cp /the_path/the_big_file /u02/app/oracle/oradata/HPMVRN/

No problem with this acfs volume.

Let’s try to create the database into ASM:


odacli list-databases | tail -n 1

bed7f9a0-e108-4423-8b2c-d7c33c795e87 HPMVRN Si 12.1.0.2 false Oltp Odb1s Acfs Failed 0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb

odacli delete-database -i "bed7f9a0-e108-4423-8b2c-d7c33c795e87"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ASM --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

No problem here, so it really seems to be related to ACFS.

Where else does create-database need free space? For sure in the RECOvery area filesystem, created with the very first database:


Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 74G 56G 19G 76% /u03/app/oracle

Let’s create a file in this filesystem:

cp /the_path/the_big_file /u03/app/oracle/

No problem.

Quite strange, and as the database is not yet created, there is no alert_HPMVRN.log to look for the error…

Maybe the RECOvery area filesystem is not big enough for odacli. The ACFS concept relies on autoextensible filesystems, but as the Fast Recovery Areas of all my databases probably won't fit in the allocated 74GB, odacli may fail. How to extend the ACFS RECOvery filesystem? Just put enough dummy files in it, and then remove them.


cd /u03/app/oracle
cp /the_path/the_big_file tmpfile1
cp tmpfile1 tmpfile2
cp tmpfile1 tmpfile3
cp tmpfile1 tmpfile4
cp tmpfile1 tmpfile5
cp tmpfile1 tmpfile6
rm -rf tmp*

df -h /u03/app/oracle
Filesystem Size Used Avail Use% Mounted on
/dev/asm/reco-329 88G 59G 30G 67% /u03/app/oracle

RECOvery filesystem is slightly bigger, let’s retry the database creation:


odacli delete-database -i "985b1d37-6f84-4d64-884f-3a429c195a5d"

odacli create-database -hm XyXyXyXyXyXy --dbstorage ACFS --characterset WE8MSWIN1252 --databaseUniqueName HPMVRN --dbhomeid '0704ef7c-0cb9-4525-8edb-8d70b8f7ddfb' --dblanguage AMERICAN --dbname HPMVRN --dbshape odb1s --dbterritory AMERICA --no-cdb --no-dbconsole --json

odacli describe-job -i "d5022d8b-9ddb-4f93-a84a-36477657794f"

Job details
----------------------------------------------------------------
ID: d5022d8b-9ddb-4f93-a84a-36477657794f
Description: Database service creation with db name: HPMVRN
Status: Success
Created: November 8, 2019 1:47:28 PM CET
Message:

Task Name Start Time End Time Status
---------------------------------------- ----------------------------------- ----------------------------------- ----------
Setting up ssh equivalance November 8, 2019 1:47:30 PM CET November 8, 2019 1:47:40 PM CET Success
Creating volume datHPMVRN November 8, 2019 1:47:40 PM CET November 8, 2019 1:47:59 PM CET Success
Creating ACFS filesystem for DATA November 8, 2019 1:47:59 PM CET November 8, 2019 1:48:14 PM CET Success
Database Service creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:54:38 PM CET Success
Database Creation November 8, 2019 1:48:14 PM CET November 8, 2019 1:53:02 PM CET Success
Change permission for xdb wallet files November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:02 PM CET Success
Place SnapshotCtrlFile in sharedLoc November 8, 2019 1:53:02 PM CET November 8, 2019 1:53:04 PM CET Success
SqlPatch upgrade November 8, 2019 1:54:02 PM CET November 8, 2019 1:54:35 PM CET Success
updating the Database version November 8, 2019 1:54:35 PM CET November 8, 2019 1:54:38 PM CET Success
create Users tablespace November 8, 2019 1:54:38 PM CET November 8, 2019 1:54:40 PM CET Success

This is it. I was then able to create the next databases without any problem.

Final words

Keep in mind that all your Fast Recovery Areas should fit into the dedicated filesystem. To make sure that you will not encounter problems, resize your RECOvery filesystem to the sum of all your target FRA sizes with acfsutil:

acfsutil size 500G -d /dev/asm/reco-329 /u03/app/oracle
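
To double check the new size afterwards, a df on the mount point or acfsutil info fs is enough (a quick check):

df -h /u03/app/oracle
acfsutil info fs /u03/app/oracle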

An autoextensible filesystem never guarantees that the extension will succeed.

Or simply go for ASM instead of ACFS on your ODA, if you do not need ACFS features like snapshots. ASM is simpler and more efficient because it does not provision disk space for each database like ACFS does.

The post odacli create-database error DCS-10802: Insufficient disk space on file system: database appeared first on Blog dbi services.

Avoiding patching madness by using AWS SSM

Fri, 2019-11-08 13:15

As we have more and more customers either currently moving to AWS or having already moved much, if not all, of their workloads to AWS, one question pops up quite frequently: now that we have so many EC2 instances up and running, how do we regularly patch them? There are many tools around for patching operating systems, such as Microsoft SCCM, Red Hat Satellite or SUSE Manager, just to name a few. There are many others as well, but most of them have an important drawback: either they can only patch Windows systems or they can only patch Linux systems (and even here usually only a subset of the distributions currently in use). This is where AWS Systems Manager becomes very handy, as you can patch most of the commonly used operating systems using one single tool. You can even patch your on-prem instances if your network setup includes them as well. In this post I'll share what we did for one of our customers to remove the pain of either using multiple tools to patch all the operating systems, or patching all the systems manually and then (manually) documenting what has been done and what the result was.

Managed Instances

Once you start with AWS SSM there are a few key concepts you need to understand. The first one is the so-called “Managed Instance”. For an instance to be managed it must fulfill two requirements:

  1. The AWS agent must be deployed on the instance
  2. An IAM role with the correct permission must be assigned to the instance

For the AWS agent it is quite easy if you deployed any of these, because the agent will be there by default:

  • Windows Server 2003-2012 R2 AMIs published in November 2016 or later
  • Amazon Linux
  • Amazon Linux 2
  • Ubuntu Server 16.04
  • Ubuntu Server 18.04

If you are running something else the agent needs to be installed manually as described here for Windows and here for Linux. For on-prem instances it is straight forward as well (Windows here, Linux here) but you need to create Managed Instance Activations as well, which is not in the scope of this post.

For the IAM role we’ll first look at what I have currently deployed:

These are two Windows, two Red Hat, two Ubuntu, two Amazon Linux, two CentOS and two SUSE instances, one for Prod and one for Test each. The “Prod” and “Test” naming becomes important later, just ignore it for now. If we take a look at how many managed instances we have right now in Systems Manager we will see zero, even though the Windows, Ubuntu and Amazon Linux instances have the SSM agent deployed automatically (otherwise we would see a list of managed instances here):

We will use the “Quick Setup” here to set up the IAM role and instance profiles:

For the scope of this post we will go with defaults suggested by AWS:

We want all the instances:


The last screen shows an overview of what is currently happening, basically collecting all the information that will be stored in the AWS SSM inventory. Before you see any instances in “Managed Instances” you will need to wait until the pending inventory actions have completed:

Once that is done all the managed instances should be there (notice that I manually deployed the AWS SSM agent on the SUSE, CentOS and Red Hat instances, otherwise they would not show up here):

Patch baselines

Once all instances are “managed” there is the next concept we need to understand: Patch baselines. Patch baselines basically define which set of patches (classification, severity, …) shall get applied to which product (e.g. Red Hat 7, Red Hat 8, …). As we have six different operating systems we need at least six patch baselines. In reality you'll probably even need twelve because you will have production and test systems. Usually you want to have the same patches applied to production as you have applied to test some days or weeks before and therefore we need two patch baselines per operating system flavor. Patch baselines are listed in SSM under “Patch manager”:

The following screenshots show how I created the patch baseline for the Red Hat test systems:

For the Red Hat production systems the patch baseline is exactly the same but I’ve put an approval delay of 14 days:

Why the delay? Suppose you patch your test systems every first Tuesday of the month and then you patch the production systems 14 days later. As you want to have the same patches on production and test, the production patches get an approval delay of 14 days. This is how you can make sure that your systems are on the same level even if production is patched later than test. One point to consider is that this does not apply to the Ubuntu systems, as there is no approval delay for Ubuntu patches:

Once all the base lines have been created it looks like this:

The “Patch group” tag

The next key concept that comes into play is tags. No matter what you will be doing in the cloud, without tags you're lost. There is just no way around tags if you want to automate processes in the cloud. When it comes to AWS SSM there is the “Patch Group” tag, and this tag needs to be assigned to each instance you want to have patched automatically. Let's assume you want to have the test instances patched every first Saturday of the month and the production instances every third Saturday of the month (6 a.m. in the morning for both). In addition, all the Linux hosts should be patched before the Windows hosts, because the Linux hosts run the databases and the Windows hosts run the application servers or the application (or, for whatever reason, there is a dependency between the hosts and you need to follow a defined order for patching).

What we did to fulfill this requirement is to populate the “Patch Group” tag like this:

  1. All test instances get the value: patch_test_0600_batch_1 or patch_test_0600_batch_2
  2. All production instances get the value: patch_prod_0600_batch_1 or patch_prod_0600_batch_2

The “*batch” values build the dependency between the Linux and the Windows hosts. Taking one test Linux and one test Windows instance as an example, the tag would look like this:

I’ve assigned the tags to all the remaining instances accordingly:

  1. i-amazon-linux-test: patch_test_0600_batch_1
  2. i-centos-linux-test: patch_test_0600_batch_1
  3. i-redhat-linux-test: patch_test_0600_batch_1
  4. i-suse-linux-test: patch_test_0600_batch_1
  5. i-ubuntu-linux-test: patch_test_0600_batch_1
  6. i-windows-test: patch_test_0600_batch_2
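
If you prefer the command line over the console for this, the tag can also be set with the AWS CLI (a sketch; the instance ID is a placeholder for one of your own instances):

aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key="Patch Group",Value=patch_test_0600_batch_1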

Same procedure for the production instances but using the prod_* tags of course. Only assigning these tags to the instances is not enough though. We need to assign them to the patch baselines as well so the baseline gets selected when a patch is supposed to run on the systems. Assigning a “Patch Group” to a patch baseline basically links the patch baseline to instances with the same tag value per operating system. If you don't do that you will always get the default patch baseline for an operating system and have no chance of applying the same patches to test and prod, because you cannot have different approval delays.

This needs to be done for all the patch baselines we created above. The patch_test_0600_batch* tags get attached to the baselines for test and the patch_prod_0600_batch* tags get attached to the baselines for production.

Maintenance windows

The last building block are the maintenance windows. They define when an action will executed on which systems. You can not only use them for patching but for many other tasks as well (e.g. running a PowerShell script on Windows hosts, running a shell script on Linux hosts, running an Ansible playbook and much more).

As we want to schedule patching for all the test instances on the first Saturday of the month and all production systems on the third Saturday of the month, we'll need two maintenance windows, one for test and one for production. We'll start with the one for test:

For the scheduling you should go for a cron expression:

“6F” means the first Saturday (0 is Sunday) of each month. Note that if you skip the “Schedule timezone” setting you will need to provide UTC time in the cron expression. If you do specify your timezone, AWS automatically calculates the correct UTC time for you, as you can see once the maintenance window is created:

Now we have the scheduling but the maintenance window has no idea what to run and against what to run it. Where to run it is specified by adding targets:

You should definitely give a name to the target so you can easily identify the target afterwards:

By specifying the “Patch Group” instance tag with our values for the test systems that should run in the first batch we have all the Linux systems registered with this maintenance window. The second target will be the Windows systems that should be patched once the Linux systems have been done:

All test systems registered:

Now that we defined where to run we need to define what to run and this is specified by the “Run command”:

In our case the document is “AWS-RunPatchBaseline” but it could be any of the documents listed as mentioned earlier:

The task priority is 1 (because we want this to be executed first) and the target is “patch_test_0600_batch_1” (this is why the name for the target is important: you will probably not remember the unique string that was generated when you registered the target):

The rate control option allows you to control concurrency and error thresholds; for the IAM role go with the defaults:

We highly recommend storing the output in an S3 bucket, otherwise the output of the run command will be truncated after 2000 characters. The parameter should be “Install” (you could also go for “Scan” if you only want to scan for available patches):

Do exactly the same for the patch_test_0600_batch_2 group, with priority 2:

Both run commands are now registered:

And that’s it. Repeat the same for the production machines and all your instances are scheduled for patching. Once the maintenance window has executed you can get the logs in the “Run command” section:



Here you have the link to the log in S3:

Conclusion

When you start using AWS SSM it is a bit hard to understand all the relations between patch baselines, the patch group tag, targets, run commands and maintenance windows. Once you got that, it is quite easy to schedule patching of your whole infrastructure (even on-prem), as long as you run operating systems AWS has patch baselines for. The logs should definitely go to S3 so you have the full output available.

The post Avoiding patching madness by using AWS SSM appeared first on Blog dbi services.

Connection pooling with PgBouncer

Thu, 2019-11-07 09:38

Some of you may know the case: as soon as the number of users grows, the number of resource problems increases. Have you ever thought about using a connection pooler? Too complex, too much administration effort? In this post I would like to explain how a connection pooler can help you with your memory, as well as show you how simple it is to set up connection pooling with PgBouncer.

Introduction

By default PostgreSQL forks its main process into child processes for every new connection. Over time this can lead to more and more processes on the server. On one hand, this is pretty cool, because it can provide more stability and a nice view of resource utilization per connection. But if there are many short-lived connections, the disadvantages will predominate. The more connections you have, the more RAM will be used.
The solution for that problem can be a connection pooler like PgBouncer, an open-source connection pooling middleware especially designed for PostgreSQL. It acts like a wrapper around the database connections. It has the internals for the connection between the database and the pool, but everything is hidden from the application that connects.

Installation

For the installation of pgBouncer you can decide between two possibilities
1. using yum
2. building from git (https://pgbouncer.github.io/install.html#building-from-git)

To keep it simple, we go on with method 1.

postgres@centos-mini:/u02/pgdata/11/PG1/ [PG1] sudo yum install pgbouncer
Loaded plugins: fastestmirror
Determining fastest mirrors
epel/x86_64/metalink                                                                |  28 kB  00:00:00
 * base: pkg.adfinis-sygroup.ch
 * epel: pkg.adfinis-sygroup.ch
 * extras: pkg.adfinis-sygroup.ch
 * updates: pkg.adfinis-sygroup.ch
base                                                                                | 3.6 kB  00:00:00
epel                                                                                | 5.3 kB  00:00:00
extras                                                                              | 2.9 kB  00:00:00
ius                                                                                 | 1.3 kB  00:00:00
pgdg10                                                                              | 3.6 kB  00:00:00
pgdg11                                                                              | 3.6 kB  00:00:00
updates                                                                             | 2.9 kB  00:00:00
(1/10): base/7/x86_64/group_gz                                                      | 165 kB  00:00:06
(2/10): extras/7/x86_64/primary_db                                                  | 153 kB  00:00:00
(3/10): epel/x86_64/group_gz                                                        |  90 kB  00:00:06
(4/10): epel/x86_64/primary_db                                                      | 6.9 MB  00:00:08
(5/10): epel/x86_64/updateinfo                                                      | 1.0 MB  00:00:08
(6/10): pgdg11/7/x86_64/primary_db                                                  | 337 kB  00:00:01
(8/10): base/7/x86_64/primary_db                                                    | 6.0 MB  00:00:10
(10/10): updates/7/x86_64/primary_db                                                | 2.8 MB  00:00:01
(11/10): ius/x86_64/primary                                                         | 139 kB  00:00:06
(12/10): pgdg10/7/x86_64/primary_db                                                 | 384 kB  00:00:06
ius                                                                                              631/631
Resolving Dependencies
--> Running transaction check
---> Package pgbouncer.x86_64 0:1.12.0-1.rhel7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================================================================================================================
 Package                           Arch                                               Version                                                     Repository                                          Size
===========================================================================================================================================================================================================
Installing:
 pgbouncer                         x86_64                                             1.12.0-1.rhel7                                              pgdg10                                             207 k

Transaction Summary
===========================================================================================================================================================================================================
Install  1 Package

Total download size: 207 k
Installed size: 477 k
Is this ok [y/d/N]: y
Downloading packages:
pgbouncer-1.12.0-1.rhel7.x86_64.rpm                                                                                                                                                | 207 kB  00:00:06
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1
  Verifying  : pgbouncer-1.12.0-1.rhel7.x86_64                                                                                                                                                     1/1

Installed:
  pgbouncer.x86_64 0:1.12.0-1.rhel7

Complete!
Configuration pgbouncer.ini

PgBouncer has one central configuration file called pgbouncer.ini. By default it is located under /etc/pgbouncer and it is used to configure the PgBouncer pool.
You can define a lot of parameters in this file, most of them commented out by default, and you can start with a very minimal configuration. It includes generic settings such as logfile, listen_addr, listen_port… as well as connectivity, console access control, admin_users and many, many more. The full list of parameters can be found in the pgbouncer documentation, which is really detailed and gives you a good overview.
Let's have a look at a simple sample pgbouncer.ini file:

cat /etc/pgbouncer/pgbouncer.ini
[databases]
bouncer= host=localhost dbname=bouncer

[pgbouncer]
listen_addr=127.0.0.1
listen_port= 6432
auth_type = md5
auth_file= /etc/pgbouncer/userlist.txt
admin_users=bounce
pool_mode=session
max_client_conn = 100
default_pool_size = 20
logfile = /etc/pgbouncer/log/pgbouncer.log
pidfile = /etc/pgbouncer/pgbouncer.pid

The ini file has two sections: first the [databases] section, which defines the alias(es) for the database(s). Just start a new line for every database; you can also define user, port etc. Then there is the [pgbouncer] section, where the pool configuration is done. You can also start with a very minimal configuration.
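
A database entry can carry more than just host and dbname; a slightly richer (purely illustrative) alias could look like this, overriding the port and the pool size for that alias:

[databases]
bouncer2 = host=127.0.0.1 port=5433 dbname=bouncer user=bounce pool_size=10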

One of the most important parameters is pool_mode, which defines how a server connection can be reused. 3 modes can be defined:
session: This is the default value: Connections are released back to the pool after disconnection.
transaction: Releases the connection to the pool once a transaction finishes.
statement: After a query finishes, the connection is released back to the pool.

The other parameters in the [pgbouncer] section, briefly explained:
listen_addr: List of addresses where to listen for TCP connections.
listen_port: The listening port.
admin_users: Users from the auth_file which get access to a special pgbouncer database. This database provides performance-related information about PgBouncer.
max_client_conn: The maximum number of client connections allowed. The default value is 100, but there is also a formula to calculate a suitable value.
default_pool_size: The number of server connections allowed per user/database pair. The default value is 20.
logfile: This one is self-explanatory: the log file location.
pidfile: The location of the PID file.
auth_type and auth_file: These two belong together. auth_type specifies how to authenticate users (pam|hba|md5) against PgBouncer and auth_file contains the usernames and passwords.

userlist.txt

As already mentioned, this file has a very simple structure: username and password. You can either write the password in plain text or the MD5 hash of the password. Of course it is not very secure to put the plain text password in here.

cat /etc/pgbouncer/userlist.txt
"bounce" "md51db1c086e81505132d1834e06d07420e"
"postgres" "md53175bce1d3201d16594cebf9d7eb3f9d"
Start PgBouncer

Now all the configuration is done and PgBouncer can be started. It is possible to start PgBouncer on the command line and see the log output directly:

postgres@centos-mini:/etc/pgbouncer/ [PG1] /bin/pgbouncer pgbouncer.ini
2019-11-06 19:40:05.862 CET [13498] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 192
2019-11-06 19:40:05.864 CET [13498] LOG listening on 127.0.0.1:16432
2019-11-06 19:40:05.864 CET [13498] LOG listening on unix:/tmp/.s.PGSQL.16432
2019-11-06 19:40:05.864 CET [13498] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
2019-11-06 19:41:05.868 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 0 us
2019-11-06 19:41:39.325 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:39.326 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58648 closing because: client unexpected eof (age=0s)
2019-11-06 19:41:47.577 CET [13498] LOG C-0x18da360: db1/user1@127.0.0.1:58652 login attempt: db=db1 user=user1 tls=no
2019-11-06 19:41:47.579 CET [13498] LOG S-0x18e0c30: db1/user1@[::1]:5432 new connection to server (from [::1]:37654)
2019-11-06 19:42:05.869 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 157 us
2019-11-06 19:43:05.872 CET [13498] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 2 B/s, xact 1522 us, query 1522 us, wait 0 us

Furthermore it is possible to create a service which starts automatically in the background after every reboot:

cat /etc/systemd/system/pgbouncer.service
[Unit]
Description=A lightweight connection pooler for PostgreSQL
After=syslog.target
After=network.target

[Service]
Type=simple

User=postgres
Group=postgres

# Path to the init file
Environment=BOUNCERCONF=/etc/pgbouncer/pgbouncer.ini

PIDFile=/var/run/pgbouncer/pgbouncer.pid

# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
# StandardOutput=syslog

ExecStart=/bin/pgbouncer ${BOUNCERCONF}
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGINT

# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300

[Install]
WantedBy=multi-user.target
postgres@centos-mini:/etc/ [PG1] sudo systemctl start pgbouncer
postgres@centos-mini:/etc/ [PG1] sudo systemctl status pgbouncer
● pgbouncer.service - A lightweight connection pooler for PostgreSQL
   Loaded: loaded (/etc/systemd/system/pgbouncer.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-11-07 15:17:09 CET; 4s ago
 Main PID: 17298 (pgbouncer)
   CGroup: /system.slice/pgbouncer.service
           └─17298 /bin/pgbouncer /etc/pgbouncer/pgbouncer.ini

Nov 07 15:17:09 centos-mini systemd[1]: Started A lightweight connection pooler for PostgreSQL.
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG kernel file descriptor limit: 1024 (hard: 4096); max_client_conn: 100, max expected fd use: 172
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on 127.0.0.1:6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG listening on unix:/tmp/.s.PGSQL.6432
Nov 07 15:17:09 centos-mini pgbouncer[17298]: 2019-11-07 15:17:09.127 CET [17298] LOG process up: PgBouncer 1.12.0, libevent 2.0.21-stable (epoll), adns: libc-2.17, tls: OpenSSL 1.0.2k-fips  26 Jan 2017
15:17:13 postgres@centos-mini:/etc/ [PG1]

PgBouncer is running now and you can connect to the database using PgBouncer.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost bouncer
Password for user bounce:
psql (11.4 dbi services build)
Type "help" for help.
bouncer=>
bouncer=>
Monitoring

I already mentioned the admin users before and want to explain them in a little more detail now.
PgBouncer allows users with admin rights to connect to the virtual database “pgbouncer”. You can use this database to see who is connecting, how many active connections are in a pool, and much more. As a good starting point, use “SHOW HELP” as soon as you are connected.

postgres@centos-mini:/etc/ [PG1] psql -U bounce -p 6432 -h localhost pgbouncer
Password for user bounce:
psql (11.4 dbi services build, server 1.12.0/bouncer)
Type "help" for help.
pgbouncer=# SHOW HELP;
NOTICE:  Console usage
DETAIL:
        SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|USERS|VERSION
        SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
        SHOW DNS_HOSTS|DNS_ZONES
        SHOW STATS|STATS_TOTALS|STATS_AVERAGES|TOTALS
        SET key = arg
        RELOAD
        PAUSE [<db>]
        RESUME [<db>]
        DISABLE <db>
        ENABLE <db>
        RECONNECT [<db>]
        KILL <db>
        SUSPEND
        SHUTDOWN
SHOW
pgbouncer=#
pgbouncer=# SHOW POOLS;
 database  |   user    | cl_active | cl_waiting | sv_active | sv_idle | sv_used | sv_tested | sv_login | maxwait | maxwait_us | pool_mode
-----------+-----------+-----------+------------+-----------+---------+---------+-----------+----------+---------+------------+-----------
 bouncer   | bounce    |         2 |          0 |         0 |       0 |       1 |         0 |        0 |       0 |          0 | session
 pgbouncer | pgbouncer |         1 |          0 |         0 |       0 |       0 |         0 |        0 |       0 |          0 | statement
(2 rows)

pgbouncer=# SHOW CLIENTS;
 type |  user  | database  | state  |   addr    | port  | local_addr | local_port |      connect_time       |      request_time       | wait | wait_us | close_needed |    ptr    | link | remote_pid | tls
------+--------+-----------+--------+-----------+-------+------------+------------+-------------------------+-------------------------+------+---------+--------------+-----------+------+------------+-----
 C    | bounce | bouncer   | active | 127.0.0.1 | 40322 | 127.0.0.1  |       6432 | 2019-11-07 15:24:40 CET | 2019-11-07 15:24:40 CET |    0 |       0 |            0 | 0x1bd9598 |      |          0 |
 C    | bounce | bouncer   | active | 127.0.0.1 | 40332 | 127.0.0.1  |       6432 | 2019-11-07 15:25:12 CET | 2019-11-07 15:25:12 CET |    0 |       0 |            0 | 0x1bd97b0 |      |          0 |
 C    | bounce | pgbouncer | active | 127.0.0.1 | 40314 | 127.0.0.1  |       6432 | 2019-11-07 15:21:50 CET | 2019-11-07 15:25:36 CET |  221 |  440169 |            0 | 0x1bd9380 |      |          0 |
(3 rows)
Conclusion

The above configuration is only a simple example of how the configuration can look. Of course there are many more specifics you can define. PgBouncer is a great tool for connection pooling and can help you to scale down the memory usage of your server. The connections of a pool are always available and, in contrast to forking new backend processes, there is no need to reach out to the database to establish a connection every time. The connections are just there.

The article Connection pooling with PgBouncer appeared first on Blog dbi services.

How to scale up a Patroni cluster

Wed, 2019-11-06 04:43

During the preparation of my presentation for pgconf.eu I ran into one big issue: I had to stop my cluster to add a new node. That was not the way I wanted to achieve this. I want a high availability solution that can be scaled up without any outage. Thanks to a little hint during pgconf.eu I was able to find a solution. In this post I will show the manual scale-up, without using a playbook.

Starting position

We start with a 3 node Patroni cluster which can be created using this blog post.
Now we want to add a fourth node to the existing etcd and Patroni cluster. In case you also need a playbook to install a fourth node, check out my GitHub repository.

Scale up the etcd cluster

This step is only needed when you want to scale up your etcd cluster as well. To scale up a Patroni cluster it is not necessary to scale up the etcd cluster; you can, of course, scale up Patroni without adding more etcd cluster members. But maybe someone also needs to scale up their etcd cluster and is searching for a solution. If not, just jump to the next step.

Make sure the etcd and patroni services are not started on the fourth node.

postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
postgres@patroni4:/home/postgres/ [PG1]

Make the following adjustments in the etcd.conf of the 4th node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/etcd.conf
name: patroni4
data-dir: /u02/pgdata/etcd
initial-advertise-peer-urls: http://192.168.22.114:2380
listen-peer-urls: http://192.168.22.114:2380
listen-client-urls: http://192.168.22.114:2379,http://localhost:2379
advertise-client-urls: http://192.168.22.114:2379
initial-cluster-state: 'existing'
initial-cluster: patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380,patroni4=http://192.168.22.114:2380

Next, add the new etcd member to the existing etcd cluster. You can execute this on any existing member of the cluster.

postgres@patroni1:/home/postgres/ [PG1] etcdctl member add patroni4 http://192.168.22.114:2380
Added member named patroni4 with ID dd9fab8349b3cfc to cluster

ETCD_NAME="patroni4"
ETCD_INITIAL_CLUSTER="patroni4=http://192.168.22.114:2380,patroni1=http://192.168.22.111:2380,patroni2=http://192.168.22.112:2380,patroni3=http://192.168.22.113:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Now you can start the etcd service on the 4th node.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start etcd
postgres@patroni4:/home/postgres/ [PG1] systemctl status etcd
● etcd.service - dbi services etcd service
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 16:39:16 CEST; 9s ago
 Main PID: 8239 (etcd)
   CGroup: /system.slice/etcd.service
           └─8239 /u01/app/postgres/local/dmk/bin/etcd --config-file /u01/app/postgres/local/dmk/etc/etcd.conf
postgres@patroni4:/home/postgres/ [PG1]

And after a short check we can see that node 4 has been added to the existing cluster:

postgres@patroni4:/home/postgres/ [PG1] etcdctl member list
dd9fab8349b3cfc: name=patroni4 peerURLs=http://192.168.22.114:2380 clientURLs=http://192.168.22.114:2379 isLeader=false
16e1dca5ee237693: name=patroni1 peerURLs=http://192.168.22.111:2380 clientURLs=http://192.168.22.111:2379 isLeader=false
28a43bb36c801ed4: name=patroni2 peerURLs=http://192.168.22.112:2380 clientURLs=http://192.168.22.112:2379 isLeader=false
5ba7b55764fad76e: name=patroni3 peerURLs=http://192.168.22.113:2380 clientURLs=http://192.168.22.113:2379 isLeader=true
Scale up Patroni

Scaling up the Patroni cluster is also really easy.
Adjust the hosts entry in the patroni.yml on the new node.

postgres@patroni4:/home/postgres/ [PG1] cat /u01/app/postgres/local/dmk/etc/patroni.yml | grep hosts
  hosts: 192.168.22.111:2379,192.168.22.112:2379,192.168.22.113:2379,192.168.22.114:2379

Afterwards, start the Patroni service.

postgres@patroni4:/home/postgres/ [PG1] sudo systemctl start patroni
postgres@patroni4:/home/postgres/ [PG1] systemctl status patroni
● patroni.service - dbi services patroni service
   Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-10-17 17:03:19 CEST; 5s ago
  Process: 8476 ExecStartPre=/usr/bin/sudo /bin/chown postgres /dev/watchdog (code=exited, status=0/SUCCESS)
  Process: 8468 ExecStartPre=/usr/bin/sudo /sbin/modprobe softdog (code=exited, status=0/SUCCESS)
 Main PID: 8482 (patroni)
   CGroup: /system.slice/patroni.service
           ├─8482 /usr/bin/python2 /u01/app/postgres/local/dmk/bin/patroni /u01/app/postgres/local/dmk/etc/patroni.yml
           ├─8500 /u01/app/postgres/product/11/db_5/bin/postgres -D /u02/pgdata/11/PG1/ --config-file=/u02/pgdata/11/PG1/postgresql.conf --listen_addresses=192.168.22.114 --max_worker_processes=8 --max_locks_per_transact...
           ├─8502 postgres: PG1: logger
           ├─8503 postgres: PG1: startup   waiting for 000000020000000000000006
           ├─8504 postgres: PG1: checkpointer
           ├─8505 postgres: PG1: background writer
           ├─8506 postgres: PG1: stats collector
           ├─8507 postgres: PG1: walreceiver
           └─8513 postgres: PG1: postgres postgres 192.168.22.114(48882) idle

To be sure everything runs correctly, check the status of the Patroni cluster:

postgres@patroni4:/home/postgres/ [PG1] patronictl list
+---------+----------+----------------+--------+---------+----+-----------+
| Cluster |  Member  |      Host      |  Role  |  State  | TL | Lag in MB |
+---------+----------+----------------+--------+---------+----+-----------+
|   PG1   | patroni1 | 192.168.22.111 |        | running |  2 |       0.0 |
|   PG1   | patroni2 | 192.168.22.112 |        | running |  2 |       0.0 |
|   PG1   | patroni3 | 192.168.22.113 | Leader | running |  2 |       0.0 |
|   PG1   | patroni4 | 192.168.22.114 |        | running |  2 |       0.0 |
+---------+----------+----------------+--------+---------+----+-----------+
Conclusion

Using the playbooks had one shortcoming: the hosts entry in the patroni.yml was only pointing to localhost. When starting the fourth node, Patroni was not looking for all the other hosts, it was just looking at its own availability. This works fine in an initial cluster, but not when you want to extend one.
And: always keep in mind that you need an uneven number of members for an etcd cluster, so don't add only a fourth etcd node (a quick quorum check is sketched below).
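
A quick way to verify quorum after such membership changes is the health check of the etcdctl v2 API used above (a small sketch):

etcdctl cluster-health
# every member and the cluster itself should be reported as healthy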

The article How to scale up a Patroni cluster appeared first on Blog dbi services.

Galera Cluster 4 with MariaDB 10.4.8

Wed, 2019-11-06 02:31

Last month, at a new customer, I had to install the latest version of the MariaDB server, 10.4.8, to set up a Galera Cluster with 3 master nodes.
The good news was that this version ships with the latest version of the Galera plugin from Codership: Galera Cluster 4.0.
As usual, installation & configuration were quite easy.

$ sudo yum -y install MariaDB-server
$ sudo yum list installed|grep -i mariadb
MariaDB-client.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-common.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-compat.x86_64 10.4.8-1.el7.centos @mariadb-main
MariaDB-server.x86_64 10.4.8-1.el7.centos @mariadb-main
galera.x86_64 26.4.2-1.rhel7.el7.centos @mariadb-main
New Features:

Now I just want to introduce some of the interesting new high-level features available in this version.

Streaming Replication

In previous versions, when we had large and long-running write transactions, we always had conflicts during
the certification-based replication and, because of this, transactions were often aborted and rolled back.
Some of our customers were really suffering and complaining because of this problem and limitation.
Now, when there is a big transaction, the node that initiated it does not have to wait until the “commit”:
it breaks the transaction into fragments, certifies them and replicates them on the other master nodes while the transaction is still running.
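
Streaming replication is off by default and is enabled per session by setting a fragment size greater than 0 before running the large transaction. A minimal sketch via the mysql client (the fragment unit and size values are only example assumptions):

mysql -u root -p -e "
  SET SESSION wsrep_trx_fragment_unit = 'rows';
  SET SESSION wsrep_trx_fragment_size = 10000;
  -- the large transaction would run here, in the same session,
  -- and a certified fragment is replicated every 10000 rows
  SET SESSION wsrep_trx_fragment_size = 0;
"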

New System Tables

When having a look at the mysql database, we can see 3 new system tables.
They contain information that is already available in status variables, but now it is persistent:

MariaDB [mysql]> show tables from mysql like 'wsrep_%';
+---------------------------+
| Tables_in_mysql (wsrep_%) |
+---------------------------+
| wsrep_cluster             |
| wsrep_cluster_members     |
| wsrep_streaming_log       |
+---------------------------+
3 rows in set (0.000 sec)

These 3 new tables should give database administrators a better overview of the current status of the cluster.

MariaDB [mysql]> select * from wsrep_cluster;
+--------------------------------------+---------+------------+------------------+--------------+
| cluster_uuid                         | view_id | view_seqno | protocol_version | capabilities |
+--------------------------------------+---------+------------+------------------+--------------+
| 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | 3       | 1967       | 4                | 184703       |
+--------------------------------------+---------+------------+------------------+--------------+
1 row in set (0.000 sec)

cluster_uuid is the uuid of the cluster, corresponding to the status variable: wsrep_cluster_state_uuid
view_id is the number of cluster configuration changes, corresponding to the status variable wsrep_cluster_conf_id
view_seqno is the latest Galera sequence number, corresponding to the status variable: wsrep_last_committed
protocol_version is the MariaDB wsrep patch version, corresponding to the status variable: wsrep_protocol_version
capabilities is the capabilities bitmask provided by the Galera library.

MariaDB [mysql]> select * from wsrep_cluster_members;
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| node_uuid                            | cluster_uuid                         | node_name | node_incoming_address |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
| 6542be69-ffd5-11e9-a2ed-a363df0547d5 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node1     | AUTO                  |
| 6ae6fec5-ffd5-11e9-bb70-da54860baa6d | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node2     | AUTO                  |
| 7054b852-ffd5-11e9-8a45-72b6a9955d28 | 6c41b92b-e0f9-11e8-9924-de3112d0ce21 | node3     | AUTO                  |
+--------------------------------------+--------------------------------------+-----------+-----------------------+
3 rows in set (0.000 sec)

This system table displays the current membership of the cluster.
It contains a row for each node and member in the cluster.
node_uuid is the unique identifier of the master node.
cluster_uuid is the unique identifier of the cluster. It must be the same for all members.
node_name is self-explanatory.
node_incoming_address stores the IP address and port for client connections.

MariaDB [mysql]> select * from wsrep_streaming_log;
Empty set (0.000 sec)

This system table will contain rows only if there is a transaction with streaming replication enabled.

Synchronization Functions

These new SQL functions can be used in wsrep synchronization operations.
It is possible to use them to obtain or wait for a GTID (Global Transaction ID), as sketched after this list:
WSREP_LAST_SEEN_GTID(): returns the GTID of the last write transaction observed by the client
WSREP_LAST_WRITTEN_GTID(): returns the GTID of the last write transaction made by the client
WSREP_SYNC_WAIT_UPTO_GTID(): blocks the client until the node applies and commits the given transaction
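
A minimal sketch of how these functions can be combined (the table app.t1 and the host names are only example assumptions): perform a write on one node, then make another node wait until that write has been applied there before reading:

mysql -h node1 -e "INSERT INTO app.t1 VALUES (42); SELECT WSREP_LAST_WRITTEN_GTID();"
# take the GTID printed above and wait for it on another node before querying:
mysql -h node3 -e "SELECT WSREP_SYNC_WAIT_UPTO_GTID('<gtid-from-node1>'); SELECT * FROM app.t1;"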

Conclusion:

These new features, and especially streaming replication, which is a real improvement and a huge boost for large
transaction support, should satisfy users and DBAs and hopefully give them a better opinion of MariaDB Galera Cluster.
In another blog, I will try to demonstrate how this streaming replication works.

The article Galera Cluster 4 with MariaDB 10.4.8 appeared first on Blog dbi services.

Create a Kubernetes cluster with Google Kubernetes Engine

Wed, 2019-11-06 01:16

Nowadays the market for cloud providers is very competitive. Large companies are fighting a very hard battle over the services they provide. Each offers a wide range of more or less identical products with specific features for each.

From my point of view, having deployed Kubernetes clusters in several environments (cloud and on-premises), I pay particular attention to Google Cloud for its Google Kubernetes Engine offer. The deployment of a Kubernetes cluster is very fast and allows us to have a test/production environment in a few minutes.

Therefore, in this blog post, we will explain how to create a Kubernetes cluster in Google Cloud with some useful additional resources.

Prerequisites

A Google account is needed. You can create one by following the sign-up link: https://cloud.google.com. Otherwise, you can use the free tier account: https://cloud.google.com/free/?hl=en.

Create your project

Go to the cloud portal through the following link: https://console.cloud.google.com/home/dashboard

The first step is the creation of a project: before creating any resource, you need a project to encapsulate all your resources. To create a project, follow the steps below:

Enter your project name and click on create:

After a few seconds, your project will be created, and you will have access to the home dashboard:

Create your cluster

Once the project is created and ready to use, let’s create now our Kubernetes cluster. Click on the Kubernetes Engine menu and clusters sub-menu to begin the creation process.

Once the Kubernetes Engine API is enabled, we can click on the create cluster button and configure our cluster as needed.

We choose a standard cluster with 3 cluster nodes. You can edit the resources of your cluster according to your needs. For our example, we kept the default configuration provided by the API.

Click on the create button and after a few minutes, your cluster is ready for usage.
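
If you prefer the command line over the console, a roughly equivalent cluster could also be created with the Cloud SDK (a sketch; the cluster name and zone are assumptions matching the defaults used in this example):

gcloud container clusters create standard-cluster-1 --num-nodes=3 --zone=us-central1-a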

Start using your Kubernetes cluster

The Google Cloud SDK is needed to use your Kubernetes cluster from your favorite client platform. To install the Google Cloud SDK, follow the instructions here:

Once the Cloud SDK is properly installed, we can initialize our environment with the following steps:

mehdi@MacBook-Pro: gcloud init
Welcome! This command will take you through the configuration of gcloud.
 
Settings from your current configuration [default] are:
core:
  account: mehdi.bada68@gmail.com
  disable_usage_reporting: 'True'
  project: jx-k8s-2511
 
Pick configuration to use:
 [1] Re-initialize this configuration [default] with new settings
 [2] Create a new configuration
Please enter your numeric choice:  1
 
Your current configuration has been set to: [default]
 
You can skip diagnostics next time by using the following flag:
  gcloud init --skip-diagnostics
 
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
 
Choose the account you would like to use to perform operations for
this configuration:
 [1] mehdi.bada68@gmail.com
 [2] Log in with a new account
Please enter your numeric choice:  1
 
You are logged in as: [mehdi.bada68@gmail.com].
 
Pick cloud project to use:
 [1] kubernetes-infra-258110
 [2] Create a new project
Please enter numeric choice or text value (must exactly match list
item):  1
 
Your current project has been set to: [kubernetes-infra-258110].
 
Do you want to configure a default Compute Region and Zone? (Y/n)?  Y
 
Which Google Compute Engine zone would you like to use as project
default?
If you do not specify a zone via a command-line flag while working
with Compute Engine resources, the default is assumed.
 
Please enter numeric choice or text value (must exactly match list
item):  8

Now log in to gcloud:

mehdi@MacBook-Pro: gcloud auth login
… 
You are now logged in as [mehdi.bada68@gmail.com].
Your current project is [kubernetes-infra-258110].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID

Update your ~/.kube/config file with the credentials of the new cluster created before:

mehdi@MacBook-Pro: gcloud container clusters get-credentials standard-cluster-1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for standard-cluster-1.

Your kubectl client is now connected to your remote GKE cluster.

mehdi@MacBook-Pro: kubectl get nodes -o wide
NAME                                                STATUS   ROLES    AGE   VERSION          INTERNAL-IP   EXTERNAL-IP     OS-IMAGE                             KERNEL-VERSION   CONTAINER-RUNTIME
gke-standard-cluster-1-default-pool-1ac453ab-6tj4   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.3    34.70.191.147   Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-s242   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.4    35.188.3.165    Container-Optimized OS from Google   4.14.145+        docker://18.9.7
gke-standard-cluster-1-default-pool-1ac453ab-w0j0   Ready    <none>   56m   v1.13.11-gke.9   10.128.0.2    34.70.107.231   Container-Optimized OS from Google   4.14.145+        docker://18.9.7
Deploy Kubernetes Dashboard

After configuring the kubectl client we can start deploying resources on the Kubernetes cluster. One of the most popular resources in Kubernetes is the dashboard. It gives users and admins a graphical view of all cluster resources.

Download the dashboard deployment locally:

curl -o dashboard.yaml  https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml

Then apply the deployment:

mehdi@MacBook-Pro: kubectl apply -f dashboard.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created

Create an admin Service Account and Cluster Role Binding that you can use to securely connect to the dashboard with admin-level permissions:

mehdi@MacBook-Pro: vi admin-sa.yaml 

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard

mehdi@MacBook-Pro: kubectl apply -f admin-sa.yaml
serviceaccount/admin created
clusterrolebinding.rbac.authorization.k8s.io/admin created

First, retrieve the authentication token for the admin service account, as below:

mehdi@MacBook-Pro: kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin | awk '{print $1}')
Name:         admin-token-dpsl9
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: admin
              kubernetes.io/service-account.uid: 888de3dc-ffff-11e9-b5ca-42010a800046

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1119 bytes
namespace:  20 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi10b2tlbi1kcHNsOSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJhZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6Ijg4OGRlM2RjLWZmZmYtMTFlOS1iNWNhLTQyMDEwYTgwMDA0NiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbiJ9.DBrfylt1RFDpHEuTy4l0BY-kRwFqm9Tvfne8Vu-IZVghy87vVWtsCatjt2wzCtMjX-I5oB0YAYmio7pTwPV-Njyd_VvbWupqOF7yiYE72ZXri0liLnQN5qbtyOmswsjim0ehG_yQSHaAqp21cQdPXb59ItBLN7q0-dh8wBRyOMAVLttjbmzBb02XxtJlALYg8F4hAkyHjJAzHAyntMylUXyS2gn471WUYFs1usDDpA8uZRU3_K6oyccXa-xqs8kKRB1Ch6n4Cq9TeMKkoUyv0_alEEQvwkp_uQCl2Rddk7bLNnjfDXDPC9LXOT-2xfvUf8COe5dO-rUXemHJlhPUHw

Copy the token value.
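
To print only the token value, the two commands can be combined (a small sketch on Linux, assuming the admin service account created above):

SECRET=$(kubectl -n kubernetes-dashboard get secret | grep admin-token | awk '{print $1}')
kubectl -n kubernetes-dashboard get secret "$SECRET" -o jsonpath='{.data.token}' | base64 --decode; echo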

Access the Kubernetes dashboard using the kubectl proxy command:

mehdi@MacBook-Pro: kubectl proxy
Starting to serve on 127.0.0.1:8001

The dashboard is now available in the following link: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/login

Choose the token authentication and paste the value from the previous output.

You now have access to the Kubernetes dashboard and have deployed your first Kubernetes resource!

Deploy an Ingress Load Balancer

In order to access your cluster service externally, we need to create an ingress load balancer for our GKE cluster. The ingress load balancer will make HTTP/HTTPS applications accessible publicly through the creation of an external IP address for the cluster.

Before creating the ingress, we need to deploy a test application for our example. Let’s deploy an NGINX server.

mehdi@MacBook-Pro: vi nginx-deployment.yaml

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
    - port: 80
  selector:
    app: nginx


mehdi@MacBook-Pro: kubectl apply -f nginx-deployment.yaml

deployment.apps/nginx-deployment unchanged
service/nginx created

Create the ingress resource and deploy it as follows:

mehdi@MacBook-Pro: vi basic-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: nginx
          servicePort: 80

mehdi@MacBook-Pro: kubectl apply -f basic-ingress.yaml
ingress.extensions/basic-ingress created

Verify the status of the ingress:

mehdi@MacBook-Pro: kubectl get ing -o wide
NAME            HOSTS   ADDRESS         PORTS   AGE
basic-ingress   *       34.102.214.94   80      8h

The ingress resources have been properly created. We can see the result directly from the Google Cloud dashboard.

The NGINX service is now available via the ingress load balancer and can be accessed through the external IP address shown above.
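
A quick test against that external IP address (a sketch; it can take a few minutes until the Google Cloud load balancer actually serves traffic):

curl -I http://34.102.214.94/
# an HTTP 200 from the NGINX welcome page is expected once the load balancer is ready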

The article Create a Kubernetes cluster with Google Kubernetes Engine appeared first on Blog dbi services.

Handling PostgreSQL installations from packages

Sun, 2019-11-03 13:18
In this blog I will show how to handle a PostgreSQL installation with a customized PGDATA using the packages provided by the PostgreSQL community.

One issue with the packages is the hard-coded PGDATA, which would be overwritten in the service file with each update of PostgreSQL. This blog entry is based on PostgreSQL 12 with CentOS 7 and CentOS 8.

On a minimal installation, in my opinion a few things are missing: the net-tools package and nano as editor (I prefer using nano instead of vi).

CentOS 7:

$ yum install net-tools
$ yum install nano

CentOS 8:

$ dnf install net-tools
$ dnf install nano

To use the PostgreSQL repository it is important to exclude PostgreSQL from the CentOS repositories.

On CentOS 7 you need to edit the CentOS-Base repo file to exclude PostgreSQL from Base and Updates.

$ nano /etc/yum.repos.d/CentOS-Base.repo

# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client.  You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql* 

#released updates
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates&infra=$inf$
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
#exclude PostgreSQL from os repository 
exclude=postgresql*

#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus&infra=$$

On CentOS 8 it is just one command to exclude PostgreSQL from the distribution repository:

$ dnf -y module disable postgresql

Add the PostgreSQL repository to CentOS 7, in this example for PostgreSQL 12:

$ yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

And the same for CentOS 8

$ dnf install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm

Now it is time to install PostgreSQL 12 from the PostgreSQL repository, but do NOT run initdb at this point.

CentOS 7:

$ yum install postgresql12 postgresql12-server postgresql12-contrib

CentOS 8:

$ dnf install postgresql12 postgresql12-server postgresql12-contrib

Now it is time to create the override file for the PostgreSQL service file; the steps are identical on CentOS 7 and CentOS 8.

In my example PGDATA is in /pg_data/12/data, mounted as its own volume.

So edit the postgresql-12.service file with systemctl edit:

$ systemctl edit postgresql-12.service

And add the needed content for your customized PGDATA:

[Service]
Environment=PGDATA=/pg_data/12/data

Save the change, it will create a /etc/systemd/system/postgresql-12.service.d/override.conf file which will be merged with the original service file.

To check the content:

$ cat /etc/systemd/system/postgresql-12.service.d/override.conf
[Service]
Environment=PGDATA=/pg_data/12/data

Reload Systemd

$ systemctl daemon-reload

Hopefully your PGDATA is already owned by the postgres user; if not, make sure that it is:

$ chown -R postgres:postgres /pg_data/

Create the PostgreSQL instance as root user:

$ /usr/pgsql-12/bin/postgresql-12-setup initdb
Initializing database ... OK

Here it is:

[root@centos-8-blog /]# cd /pg_data/12/data/
[root@centos-8-blog data]# ls
base          pg_dynshmem    pg_multixact  pg_snapshots  pg_tblspc    pg_xact
global        pg_hba.conf    pg_notify     pg_stat       pg_twophase  postgresql.auto.conf
log           pg_ident.conf  pg_replslot   pg_stat_tmp   PG_VERSION   postgresql.conf
pg_commit_ts  pg_logical     pg_serial     pg_subtrans   pg_wal

From now on, PostgreSQL minor updates will be done with yum update on CentOS 7 or dnf update on CentOS 8 in one step, with no extra downtime.

But be careful: before running yum update or dnf update, STOP ALL POSTGRESQL INSTANCES!
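
A minimal sketch of such a minor update on CentOS 8 (the same idea applies with yum on CentOS 7; the service name matches the one used above):

sudo systemctl stop postgresql-12
sudo dnf update 'postgresql12*'
sudo systemctl start postgresql-12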

This also works in environments with many instances: you need a service file and an override.conf for each instance, and an additional instance needs to be created with initdb -D and not with postgresql-12-setup initdb (a sketch follows below).
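
A minimal sketch for a hypothetical second instance (the data directory, service name and port are assumptions for illustration):

# create and initialize the second data directory with initdb -D
sudo install -d -o postgres -g postgres /pg_data/12/data2
sudo -u postgres /usr/pgsql-12/bin/initdb -D /pg_data/12/data2
# give the instance its own port, e.g. port = 5433 in /pg_data/12/data2/postgresql.conf
# copy the service file and point its override to the new PGDATA
sudo cp /usr/lib/systemd/system/postgresql-12.service /etc/systemd/system/postgresql-12-second.service
sudo systemctl edit postgresql-12-second.service   # add: [Service] Environment=PGDATA=/pg_data/12/data2
sudo systemctl daemon-reload
sudo systemctl enable --now postgresql-12-second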

This method also works with SLES 12.

 

The article Handling PostgreSQL installations from packages appeared first on Blog dbi services.

pg_auto_failover: Setup and installation

Fri, 2019-11-01 02:25

When I attended PGIBZ 2019 earlier this year, I talked with Dimitri about pg_auto_failover and promised to have a look at it. Well, almost half a year later, and after we met again at pgconf.eu, it is time to actually do that. You probably already know that Citus Data was acquired by Microsoft earlier this year and that Microsoft has been committed to open source for a few years now. pg_auto_failover is one of the projects they contribute back to the PostgreSQL community. This will be a multi-part blog series, and this very first post is all about getting it up and running. In a following post we will then look at failover and switchover scenarios.

As usual, when you need auto failover you need at least three nodes and pg_auto_failover is no exception to that. The following graphic is stolen from the pg_auto_failover github page:

We have one PostgreSQL master, one PostgreSQL replica and in addition a monitoring host. In my case that maps to:

pg-af1.it.dbi-services.com   master                       192.168.22.70
pg-af2.it.dbi-services.com   replica                      192.168.22.71
pg-af3.it.dbi-services.com   monitor/cluster management   192.168.22.72

All of these nodes run CentOS 8 and I will be installing from source code as that gives the most flexibility. As pg_auto_failover depends on PostgreSQL (of course), the first step is to install PostgreSQL on all three nodes (PostgreSQL 12 in this setup). If you need further information on how to do that you can e.g. check here. Basically, these steps have been executed on all three nodes (given that the postgres user already exists and sudo is configured):

[postgres@pg-af1 ~]$ sudo dnf install -y gcc openldap-devel python36-devel readline-devel redhat-lsb bison flex perl-ExtUtils-Embed zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel openssh-clients bzip2 net-tools wget unzip sysstat xorg-x11-xauth systemd-devel bash-completion python36 policycoreutils-python-utils make git
[postgres@pg-af1 ~]$ wget https://ftp.postgresql.org/pub/source/v12.0/postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ tar -axf postgresql-12.0.tar.bz2
[postgres@pg-af1 ~]$ cd postgresql-12.0
[postgres@pg-af1 postgresql-12.0]$ sudo mkdir -p /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ sudo chown postgres:postgres /u01 /u02
[postgres@pg-af1 postgresql-12.0]$ PGHOME=/u01/app/postgres/product/12/db_0/
[postgres@pg-af1 postgresql-12.0]$ SEGSIZE=2
[postgres@pg-af1 postgresql-12.0]$ BLOCKSIZE=8
[postgres@pg-af1 postgresql-12.0]$ WALSEGSIZE=64
[postgres@pg-af1 postgresql-12.0]$ ./configure --prefix=${PGHOME} \
> --exec-prefix=${PGHOME} \
> --bindir=${PGHOME}/bin \
> --libdir=${PGHOME}/lib \
> --sysconfdir=${PGHOME}/etc \
> --includedir=${PGHOME}/include \
> --datarootdir=${PGHOME}/share \
> --datadir=${PGHOME}/share \
> --with-pgport=5432 \
> --with-perl \
> --with-python \
> --with-openssl \
> --with-pam \
> --with-ldap \
> --with-libxml \
> --with-libxslt \
> --with-segsize=${SEGSIZE} \
> --with-blocksize=${BLOCKSIZE} \
> --with-systemd \
> --with-extra-version=" dbi services build"
[postgres@pg-af1 postgresql-12.0]$ make all
[postgres@pg-af1 postgresql-12.0]$ make install
[postgres@pg-af1 postgresql-12.0]$ cd contrib
[postgres@pg-af1 contrib]$ make install
[postgres@pg-af1 contrib]$ cd ../..
[postgres@pg-af1 ~]$ rm -rf postgresql*

We will install pg_auto_failover from source code as well (again, on all three nodes):

postgres@pg-af1:/home/postgres/ [pg120] git clone https://github.com/citusdata/pg_auto_failover.git
postgres@pg-af1:/home/postgres/ [pg120] cd pg_auto_failover/
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] make install
postgres@pg-af1:/home/postgres/pg_auto_failover/ [pg120] cd ..
postgres@pg-af1:/home/postgres/ [pg120] rm -rf pg_auto_failover/

That’s it, quite easy. What I especially like is that there are no dependencies on Python or any other libraries except for PostgreSQL. What the installation gives us is basically pg_autoctl:

postgres@pg-af1:/home/postgres/ [pg120] pg_autoctl --help
pg_autoctl: pg_auto_failover control tools and service
usage: pg_autoctl [ --verbose --quiet ]


Available commands:
pg_autoctl
+ create   Create a pg_auto_failover node, or formation
+ drop     Drop a pg_auto_failover node, or formation
+ config   Manages the pg_autoctl configuration
+ show     Show pg_auto_failover information
+ enable   Enable a feature on a formation
+ disable  Disable a feature on a formation
run      Run the pg_autoctl service (monitor or keeper)
stop     signal the pg_autoctl service for it to stop
reload   signal the pg_autoctl for it to reload its configuration
help     print help message
version  print pg_autoctl version

The first step in setting up the cluster is to initialize the monitoring node:

postgres@pg-af3:/home/postgres/ [pg120] pg_autoctl create --help
pg_autoctl create: Create a pg_auto_failover node, or formation

Available commands:
pg_autoctl create
monitor    Initialize a pg_auto_failover monitor node
postgres   Initialize a pg_auto_failover standalone postgres node
formation  Create a new formation on the pg_auto_failover monitor

postgres@pg-af3:/home/postgres/ [pg120] sudo mkdir -p /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af3:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af3:/home/postgres/ [] pg_autoctl create monitor --pgdata /u02/pgdata/PG12/af
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/PG12/af"
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/PG12/af --options "-p 5432" --options "-h *" --wait start
INFO  Granting connection privileges on 192.168.22.0/24
INFO  Your pg_auto_failover monitor instance is now ready on port 5432.
INFO  pg_auto_failover monitor is ready at postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Monitor has been succesfully initialized.

Once that succeeds you’ll have a new PostgreSQL instance running and pg_auto_failover background worker processes:

postgres@pg-af3:/home/postgres/ [af] ps -ef | grep "postgres:"
postgres  5958  5955  0 14:15 ?        00:00:00 postgres: checkpointer
postgres  5959  5955  0 14:15 ?        00:00:00 postgres: background writer
postgres  5960  5955  0 14:15 ?        00:00:00 postgres: walwriter
postgres  5961  5955  0 14:15 ?        00:00:00 postgres: autovacuum launcher
postgres  5962  5955  0 14:15 ?        00:00:00 postgres: stats collector
postgres  5963  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor
postgres  5964  5955  0 14:15 ?        00:00:00 postgres: logical replication launcher
postgres  5965  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker
postgres  5966  5955  0 14:15 ?        00:00:00 postgres: pg_auto_failover monitor worker

The initialization of the monitor node also created a new database and two roles:

postgres@pg-af3:/home/postgres/ [af] psql postgres
psql (12.0 dbi services build)
Type "help" for help.

postgres=# \l
List of databases
Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
------------------+----------+----------+-------------+-------------+-----------------------
pg_auto_failover | autoctl  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
postgres         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
template0        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres
template1        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
                 |          |          |             |             | postgres=CTc/postgres

postgres=# \du
List of roles
Role name   |                         Attributes                         | Member of
--------------+------------------------------------------------------------+-----------
autoctl      |                                                            | {}
autoctl_node |                                                            | {}
postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

What we got in the new database is the pgautofailover extension:

pg_auto_failover=# \dx
List of installed extensions
Name      | Version |   Schema   |         Description
----------------+---------+------------+------------------------------
pgautofailover | 1.0     | public     | pg_auto_failover
plpgsql        | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

For our management kit to work properly a few PostgreSQL parameters will be set:

pg_auto_failover=# alter system set log_truncate_on_rotation = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_filename = 'postgresql-%a.log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_rotation_age = '1440';
ALTER SYSTEM
pg_auto_failover=# alter system set log_line_prefix = '%m - %l - %p - %h - %u@%d - %x';
ALTER SYSTEM
pg_auto_failover=# alter system set log_directory = 'pg_log';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_messages = 'WARNING';
ALTER SYSTEM
pg_auto_failover=# alter system set log_autovacuum_min_duration = '60s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_error_statement = 'NOTICE';
ALTER SYSTEM
pg_auto_failover=# alter system set log_min_duration_statement = '30s';
ALTER SYSTEM
pg_auto_failover=# alter system set log_checkpoints = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_statement = 'ddl';
ALTER SYSTEM
pg_auto_failover=# alter system set log_lock_waits = 'on';
ALTER SYSTEM
pg_auto_failover=# alter system set log_temp_files = '0';
ALTER SYSTEM
pg_auto_failover=# alter system set log_timezone = 'Europe/Zurich';
ALTER SYSTEM
pg_auto_failover=# alter system set log_connections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_disconnections=on;
ALTER SYSTEM
pg_auto_failover=# alter system set log_duration=on;
ALTER SYSTEM
pg_auto_failover=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

What we need for the other nodes is the connection string to the monitoring node:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show uri
postgres://autoctl_node@pg-af3:5432/pg_auto_failover

Once we have that we can proceed with creating the master instance on the first host:

postgres@pg-af1:/home/postgres/ [pg120] unset PGDATABASE
postgres@pg-af1:/home/postgres/ [] sudo mkdir /u02/pgdata
postgres@pg-af1:/home/postgres/ [] sudo chown postgres:postgres /u02/pgdata
postgres@pg-af1:/home/postgres/ [] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af1.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover
INFO  Found pg_ctl for PostgreSQL 12.0 at /u01/app/postgres/product/12/db_0/bin/pg_ctl
INFO  Registered node pg-af1.it.dbi-services.com:5432 with id 1 in formation "default", group 0.
INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
INFO  Successfully registered as "single" to the monitor.
INFO  Initialising a PostgreSQL cluster at "/u02/pgdata/12/PG1"
INFO  Postgres is not running, starting postgres
INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
INFO  The user "postgres" already exists, skipping.
INFO  CREATE DATABASE postgres;
INFO  The database "postgres" already exists, skipping.
INFO  FSM transition from "init" to "single": Start as a single node
INFO  Initialising postgres as a primary
INFO  Transition complete: current state is now "single"
INFO  Keeper has been succesfully initialized.

Once the master is up, bring up the replica on the second node:

postgres@pg-af2:/home/postgres/ [pg120] pg_autoctl create postgres --pgdata /u02/pgdata/12/PG1 --nodename pg-af2.it.dbi-services.com --monitor postgres://autoctl_node@pg-af3:5432/pg_auto_failover
17:11:42 INFO  Registered node pg-af2.it.dbi-services.com:5432 with id 2 in formation "default", group 0.
17:11:42 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/u02/pgdata/12/PG1/pg_autoctl.init"
17:11:42 INFO  Successfully registered as "wait_standby" to the monitor.
17:11:42 INFO  FSM transition from "init" to "wait_standby": Start following a primary
17:11:42 INFO  Transition complete: current state is now "wait_standby"
17:11:47 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
17:11:47 INFO  The primary node returned by the monitor is pg-af1.it.dbi-services.com:5432
17:11:47 INFO  Initialising PostgreSQL as a hot standby
17:11:47 INFO  Running /u01/app/postgres/product/12/db_0/bin/pg_basebackup -w -h pg-af1.it.dbi-services.com -p 5432 --pgdata /u02/pgdata/12/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ...
17:11:49 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
0/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/backup_label )
136/23699 kB (0%), 0/1 tablespace (/u02/pgdata/12/backup/global/4184  )
23708/23708 kB (100%), 0/1 tablespace (...data/12/backup/global/pg_control)
23708/23708 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

17:11:49 INFO  Postgres is not running, starting postgres
17:11:49 INFO   /u01/app/postgres/product/12/db_0/bin/pg_ctl --pgdata /u02/pgdata/12/PG1 --options "-p 5432" --options "-h *" --wait start
17:11:50 INFO  PostgreSQL started on port 5432
17:11:50 INFO  Transition complete: current state is now "catchingup"
17:11:50 INFO  Keeper has been succesfully initialized.

The next step is to start the so-called keeper process (this is the process which communicates with the monitoring node about state changes):

postgres@pg-af1:/home/postgres/ [] pg_autoctl run --pgdata /u02/pgdata/12/PG1
INFO  Managing PostgreSQL installation at "/u02/pgdata/12/PG1"
INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
INFO  pg_autoctl service is starting
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".
INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", current lsn is "0/0".

To integrate that into systemd:

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl show systemd
20:28:43 INFO  HINT: to complete a systemd integration, run the following commands:
20:28:43 INFO  pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
20:28:43 INFO  sudo systemctl daemon-reload
20:28:43 INFO  sudo systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] pg_autoctl -q show systemd --pgdata "/u02/pgdata/12/PG1" | sudo tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /u02/pgdata/12/PG1
Environment = 'PGDATA=/u02/pgdata/12/PG1'
User = postgres
ExecStart = /u01/app/postgres/product/12/db_0/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0

[Install]
WantedBy = multi-user.target

postgres@pg-af2:/home/postgres/ [PG1] systemctl list-unit-files | grep pgauto
pgautofailover.service                      disabled
20:30:57 postgres@pg-af2:/home/postgres/ [PG1] sudo systemctl enable pgautofailover.service
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.

If you are on CentOS/Red Hat 8 you will also need this as otherwise the service will not start:

postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] sudo semanage fcontext -a -t bin_t /u01/app/postgres/product/12/db_0/bin/pg_autoctl
postgres@pg-af1:/u01/app/postgres/local/dmk/ [PG1] restorecon -v /u01/app/postgres/product/12/db_0/bin/pg_autoctl

After rebooting all the nodes (to confirm that the systemd service is working as expected) the state of the cluster reports one primary and a secondary/replica as expected:

postgres@pg-af3:/home/postgres/ [af] pg_autoctl show state
Name |   Port | Group |  Node |     Current State |    Assigned State
---------------------------+--------+-------+-------+-------------------+------------------
pg-af1.it.dbi-services.com |   5432 |     0 |     1 |           primary |           primary
pg-af2.it.dbi-services.com |   5432 |     0 |     2 |         secondary |         secondary

The various states are documented here.
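
To keep an eye on the state transitions, the show state command from above can simply be wrapped in watch (a small sketch; run where pg_autoctl is installed and PGDATA points to the corresponding data directory):

watch -n 5 pg_autoctl show state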

Remember: As this is based on PostgreSQL 12 there will be no recovery.conf on the replica. The replication parameters have been added to postgresql.auto.conf automatically:

postgres@pg-af2:/u02/pgdata/12/PG1/ [PG1] cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=pgautofailover_replicator passfile=''/home/postgres/.pgpass'' connect_timeout=5 host=''pg-af1.it.dbi-services.com'' port=5432 sslmode=prefer sslcompression=0 gssencmode=disable target_session_attrs=any'
primary_slot_name = 'pgautofailover_standby'

That’s it for the setup. Really easy and simple, I like it. In the next post we’ll have a look at controlled switch-overs and fail-over scenarios.

The article pg_auto_failover: Setup and installation appeared first on Blog dbi services.

AEM Forms – Certify PDF end-up with NoSuchMethodError on bouncycastle

Thu, 2019-10-31 13:30

As part of an AEM project, we were working on setting up a few actions on PDF files. One of these actions was to Sign & Certify a PDF file. The basic Sign & Certify action provided by AEM works easily by default but if you look deeper, you might get some surprises. The complexity in this case came from the fact that we absolutely needed the signature to contain a valid Time-Stamp using the Time-Stamp Protocol (TSP) as well as a valid Long-Term Validation (LTV). In this blog, I will talk about one (of the numerous) issues we faced that I believe is related only to AEM on WebLogic.

As I mentioned above, the basic Certify operation works easily but, if you do not take a closer look, the result might not be TSP and/or LTV enabled. In our case, using AEM 6.4 SP3 on WebLogic Server 12.2.1.3, we got the Certify operation to work but without TSP & LTV:

Certify PDF - TSP failed & LTV failed

Looking at the AEM Managed Server logs, you can see that the last line is an error message:

####<Aug 28, 2019 12:15:22,278 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '16' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129013562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000055> <1566994522278> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,025 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-129513562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525025> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:25,680 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525680> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 12:15:25,681 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525681> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 12:15:25,684 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994525684> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,130 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-130E13562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526130> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 12:15:26,141 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526141> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,147 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526147> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,153 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526153> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,158 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526158> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 12:15:26,571 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994526571> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 12:15:27,835 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '60' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-13A613562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000057> <1566994527835> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 12:15:30,923 PM UTC> <Error> <com.adobe.workflow.AWS> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '67' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-12C213562811050A7F40> <7503b440-54b5-43c7-be22-0f19c434ef4c-00000056> <1566994530923> <[severity-value: 8] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <An exception was thrown with name java.lang.NoSuchMethodError message:org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier; while invoking service SignatureService and operation certify and no fault routes were found to be configured.>

 

At the same time, we also got messages of this kind:

ALC-DSC-003-000: com.adobe.idp.dsc.DSCInvocationException: Invocation error.
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:152)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequired(EjbTransactionCMTAdapterBean.java:274)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequired(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:129)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.routeMessage(AbstractMessageReceiver.java:93)
            at com.adobe.idp.dsc.provider.impl.vm.VMMessageDispatcher.doSend(VMMessageDispatcher.java:225)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageDispatcher.send(AbstractMessageDispatcher.java:69)
            at com.adobe.idp.dsc.clientsdk.ServiceClient.invoke(ServiceClient.java:215)
            at com.adobe.workflow.engine.PEUtil.invokeAction(PEUtil.java:893)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.transientInvoke(WorkflowDSCInvoker.java:356)
            at com.adobe.idp.workflow.dsc.invoker.WorkflowDSCInvoker.invoke(WorkflowDSCInvoker.java:159)
            at com.adobe.idp.dsc.interceptor.impl.InvocationInterceptor.intercept(InvocationInterceptor.java:140)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.DocumentPassivationInterceptor.intercept(DocumentPassivationInterceptor.java:53)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor$1.doInTransaction(TransactionInterceptor.java:74)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.execute(EjbTransactionCMTAdapterBean.java:357)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapterBean.doRequiresNew(EjbTransactionCMTAdapterBean.java:299)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.__WL_invoke(Unknown Source)
            at weblogic.ejb.container.internal.SessionLocalMethodInvoker.invoke(SessionLocalMethodInvoker.java:33)
            at com.adobe.idp.dsc.transaction.impl.ejb.adapter.EjbTransactionCMTAdapter_yjcxi4_ELOImpl.doRequiresNew(Unknown Source)
            at com.adobe.idp.dsc.transaction.impl.ejb.EjbTransactionProvider.execute(EjbTransactionProvider.java:143)
            at com.adobe.idp.dsc.transaction.interceptor.TransactionInterceptor.intercept(TransactionInterceptor.java:72)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvocationStrategyInterceptor.intercept(InvocationStrategyInterceptor.java:55)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.InvalidStateInterceptor.intercept(InvalidStateInterceptor.java:37)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.AuthorizationInterceptor.intercept(AuthorizationInterceptor.java:188)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.interceptor.impl.JMXInterceptor.intercept(JMXInterceptor.java:48)
            at com.adobe.idp.dsc.interceptor.impl.RequestInterceptorChainImpl.proceed(RequestInterceptorChainImpl.java:60)
            at com.adobe.idp.dsc.engine.impl.ServiceEngineImpl.invoke(ServiceEngineImpl.java:121)
            at com.adobe.idp.dsc.routing.Router.routeRequest(Router.java:131)
            at com.adobe.idp.dsc.provider.impl.base.AbstractMessageReceiver.invoke(AbstractMessageReceiver.java:329)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invokeCall(SoapSdkEndpoint.java:153)
            at com.adobe.idp.dsc.provider.impl.soap.axis.sdk.SoapSdkEndpoint.invoke(SoapSdkEndpoint.java:91)
            at sun.reflect.GeneratedMethodAccessor621.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:397)
            at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:186)
            at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:323)
            at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
            at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
            at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
            at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:454)
            at org.apache.axis.server.AxisServer.invoke(AxisServer.java:281)
            at org.apache.axis.transport.http.AxisServlet.doPost(AxisServlet.java:699)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
            at org.apache.axis.transport.http.AxisServletBase.service(AxisServletBase.java:327)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
            at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
            at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
            at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
            at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.dsc.provider.impl.soap.axis.InvocationFilter.doFilter(InvocationFilter.java:43)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at com.adobe.idp.um.auth.filter.ParameterFilter.doFilter(ParameterFilter.java:105)
            at com.adobe.idp.um.auth.filter.CSRFFilter.invokeNextFilter(CSRFFilter.java:141)
            at com.adobe.idp.um.auth.filter.CSRFFilter.doFilter(CSRFFilter.java:132)
            at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3706)
            at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3672)
            at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:328)
            at weblogic.security.service.SecurityManager.runAsForUserCode(SecurityManager.java:197)
            at weblogic.servlet.provider.WlsSecurityProvider.runAsForUserCode(WlsSecurityProvider.java:203)
            at weblogic.servlet.provider.WlsSubjectHandle.run(WlsSubjectHandle.java:71)
            at weblogic.servlet.internal.WebAppServletContext.doSecuredExecute(WebAppServletContext.java:2443)
            at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2291)
            at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2269)
            at weblogic.servlet.internal.ServletRequestImpl.runInternal(ServletRequestImpl.java:1705)
            at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1665)
            at weblogic.servlet.provider.ContainerSupportProviderImpl$WlsRequestExecutor.run(ContainerSupportProviderImpl.java:272)
            at weblogic.invocation.ComponentInvocationContextManager._runAs(ComponentInvocationContextManager.java:352)
            at weblogic.invocation.ComponentInvocationContextManager.runAs(ComponentInvocationContextManager.java:337)
            at weblogic.work.LivePartitionUtility.doRunWorkUnderContext(LivePartitionUtility.java:57)
            at weblogic.work.PartitionUtility.runWorkUnderContext(PartitionUtility.java:41)
            at weblogic.work.SelfTuningWorkManagerImpl.runWorkUnderContext(SelfTuningWorkManagerImpl.java:652)
            at weblogic.work.ExecuteThread.execute(ExecuteThread.java:420)
            at weblogic.work.ExecuteThread.run(ExecuteThread.java:360)
Caused by: java.lang.NoSuchMethodError: org.bouncycastle.asn1.x509.AlgorithmIdentifier.getObjectId()Lorg/bouncycastle/asn1/ASN1ObjectIdentifier;
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampInfoBC.matchesMessageImprint(TimestampInfoBC.java:187)
            at com.adobe.livecycle.signatures.pki.timestamp.TimestampToken.validateRequest(TimestampToken.java:430)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.createTimestamp(PKIOperations.java:562)
            at com.adobe.livecycle.signatures.service.impl.TimeStampProviderImpl.getTimestampToken(TimeStampProviderImpl.java:85)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer$1.getActualAttributes(LCPKCS7Signer.java:256)
            at com.adobe.livecycle.signatures.pki.signature.CMSPKCS7Impl.sign(CMSPKCS7Impl.java:702)
            at com.adobe.livecycle.signatures.pki.impl.PKIOperations.sign(PKIOperations.java:345)
            at com.adobe.livecycle.signatures.service.cryptoprovider.DSSPKCS7Signer.signData(DSSPKCS7Signer.java:84)
            at com.adobe.idp.cryptoprovider.LCPKCS7Signer.sign(LCPKCS7Signer.java:123)
            at com.adobe.internal.pdftoolkit.services.digsig.digsigframework.impl.SignatureHandlerPPKLite.writeSignatureAfterSave(SignatureHandlerPPKLite.java:816)
            at com.adobe.internal.pdftoolkit.services.digsig.impl.SigningUtils.doSigning(SigningUtils.java:820)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certifyWrapperAPI(SignatureManager.java:1554)
            at com.adobe.internal.pdftoolkit.services.digsig.SignatureManager.certify(SignatureManager.java:1542)
            at com.adobe.livecycle.signatures.service.impl.SignCertifyImpl.certify(SignCertifyImpl.java:894)
            at com.adobe.livecycle.signatures.service.impl.DocumentSecurityService.certify(DocumentSecurityService.java:1644)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:498)
            at com.adobe.idp.dsc.component.impl.DefaultPOJOInvokerImpl.invoke(DefaultPOJOInvokerImpl.java:118)
            ... 102 more

 

Based on the above messages, it is clear that there is a problem with some of the bouncycastle classes. This kind of error is usually either a missing class (“ClassNotFoundException”) or a conflict between two or more versions of the same library loaded by WebLogic (“NoSuchMethodError”), where the version that actually gets loaded does not contain the specific Java method being called. We opened an SR with Adobe Support (#188938) because this should not happen out of the box, but after a few days without any meaningful update from them, I decided to look into the product myself rather than keep losing time on such a trivial issue.
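
A quick way to see which libraries actually ship the class named in the stack trace is a small shell scan like the sketch below (only a sketch: it assumes the jars are directly readable on the filesystem, so jars packaged inside an EAR would have to be extracted first; $MW_HOME and $APPLICATIONS are the same locations used in the commands later in this post):

# Sketch: list every jar under the WebLogic modules and the AEM application
# locations that bundles the conflicting bouncycastle class
for jar in $(find $MW_HOME/oracle_common/modules $APPLICATIONS -name "*.jar" 2>/dev/null); do
  if unzip -l "$jar" 2>/dev/null | grep -q "org/bouncycastle/asn1/x509/AlgorithmIdentifier.class"; then
    echo "$jar"
  fi
done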

So this specific class (“org.bouncycastle.asn1.x509.AlgorithmIdentifier”) can be found in numerous jar files: apacheds*.jar, bcprov*.jar, bouncycastle*.jar, ec2*.jar, and so on. I checked all these jar files in our WebLogic Server libraries as well as in the AEM ones and found what I believe was the issue: different versions of these jars being loaded. To confirm it, and before changing anything, I deployed the WebLogic Classloader Analysis Tool (CAT) and found:

  • 0 conflicts in adobe-livecycle-cq-author.ear
  • 0 conflicts in adobe-livecycle-native-weblogic-x86_linux.ear
  • 5339 conflicts in adobe-livecycle-weblogic.ear

 
These numbers pretty much confirmed what I already suspected. Going further, I found a few hundred conflicts related to the “org.bouncycastle.*” classes alone. One of them was the class “org.bouncycastle.asn1.x509.AlgorithmIdentifier”, which was conflicting between the following files (a quick javap check on these copies is sketched right after the list):

  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-jdk15on.jar (1st loaded)
  • WebLogic: $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-ext-jdk15on.jar
  • AEM: $APPLICATIONS/adobe-livecycle-weblogic.ear/bcprov-151.jar
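
To double-check which of these copies still declares the getObjectId() method, the class can be extracted from each jar and inspected with javap. Here is a rough sketch (the /tmp working directories are arbitrary, and it assumes bcprov-151.jar sits at the root of the EAR, as the CAT output suggests):

# Pull the AEM copy out of the EAR, then extract the class from both jars
mkdir -p /tmp/bc_aem /tmp/bc_wls
cd /tmp/bc_aem
jar -xf $APPLICATIONS/adobe-livecycle-weblogic.ear bcprov-151.jar
jar -xf bcprov-151.jar org/bouncycastle/asn1/x509/AlgorithmIdentifier.class
cd /tmp/bc_wls
jar -xf $MW_HOME/oracle_common/modules/org.bouncycastle.bcprov-jdk15on.jar org/bouncycastle/asn1/x509/AlgorithmIdentifier.class
# Dump the methods of each copy; only the one declaring getObjectId() matches
# what the AEM signature code expects
javap -cp /tmp/bc_aem org.bouncycastle.asn1.x509.AlgorithmIdentifier | grep getObjectId
javap -cp /tmp/bc_wls org.bouncycastle.asn1.x509.AlgorithmIdentifier | grep getObjectId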

 
So what should be done to fix this? A simple solution is to force WebLogic to prefer the AEM-provided packages by updating the classloading preferences of the application:

[weblogic@aem-node-1 ~]$ cd $APPLICATIONS
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -xvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B1 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ sed -i 's,</prefer-application-packages>,<package-name>org.bouncycastle.*</package-name>\n&,' META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ grep -B2 "</prefer-application-packages>" META-INF/weblogic-application.xml
<package-name>org.mozilla.javascript.xmlimpl.*</package-name>
<package-name>org.bouncycastle.*</package-name>
</prefer-application-packages>
[weblogic@aem-node-1 AEM]$ 
[weblogic@aem-node-1 AEM]$ jar -uvf adobe-livecycle-weblogic.ear META-INF/weblogic-application.xml
[weblogic@aem-node-1 AEM]$ rm -rf META-INF
[weblogic@aem-node-1 AEM]$

 

The above commands simply add “<package-name>org.bouncycastle.*</package-name>” just before the end of the “<prefer-application-packages>” section, so that WebLogic knows it must use the AEM-provided classes for this package instead of its own libraries. Once that is done, redeploy the EAR file. In my case, I was left with “only” 2442 conflicts, none of them related to bouncycastle anymore (obviously).
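
Redeployment can be done from the Administration Console or scripted with the weblogic.Deployer utility. The command below is only a sketch: the admin URL, deployment name and target are placeholders to adapt to the actual domain, and the shell environment is assumed to be set with setDomainEnv.sh so that weblogic.jar is on the classpath:

# Redeploy the patched EAR; -adminurl, -name and -targets are placeholders
java weblogic.Deployer \
  -adminurl t3://aem-admin:7001 -username weblogic \
  -redeploy -name adobe-livecycle-weblogic \
  -source $APPLICATIONS/adobe-livecycle-weblogic.ear \
  -targets AEM_cluster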

After that, executing the same Certify action with the new classloader preferences resulted in no more errors:

####<Aug 28, 2019 1:12:22,359 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '109' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-1475E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000071> <1566997942359> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:23,702 PM UTC> <Info> <com.adobe.livecycle.usermanager.sslauthprovider.SSLMutualAuthProvider> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-147BE729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997943702> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Got Subject DN as CN=aem-dev,OU=IT,O=dbi services,L=Delemont,ST=Jura,C=CH>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>
####<Aug 28, 2019 1:12:24,199 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944199> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property LastCacheResetTime has been changed from  to 1555070921173>
####<Aug 28, 2019 1:12:24,200 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944200> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC001: The property CacheValidationTime has been changed from 0 to 1555070921058>
####<Aug 28, 2019 1:12:24,202 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944202> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,691 PM UTC> <Info> <Common> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14F2E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944691> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000628> <Created "1" resources for pool "IDP_DS", out of which "1" are available and "0" are unavailable.>
####<Aug 28, 2019 1:12:24,704 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944704> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,710 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944710> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,717 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944717> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,724 PM UTC> <Info> <com.adobe.formServer.common.cachemanager.CacheConfig> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944724> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <Initializing cache from default values >
####<Aug 28, 2019 1:12:24,928 PM UTC> <Info> <com.adobe.formServer.config.FormServerConfigImpl> <aem-node-1> <msAEM-01> <[ACTIVE] ExecuteThread: '116' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <BEA1-14A8E729A745050A7F40> <3a34648b-38e4-4ec5-8a0a-e6872bc1c6a1-00000072> <1566997944928> <[severity-value: 64] [rid: 0] [partition-id: 0] [partition-name: DOMAIN] > <BEA-000000> <FSC008: Using the database to access and persist configuration properties.>

 

The generated PDF now contained correct Time-Stamp (TSP) information but still no LTV (Long Term Validation) information:

Certify PDF - TSP working & LTV failed

Finally, adding a Validation step right after the Certify step in the process (in the AEM application (LCA)) allowed both the TSP and the LTV information to be shown properly:

Certify PDF - TSP working & LTV working

The post AEM Forms – Certify PDF end-up with NoSuchMethodError on bouncycastle appeared first on Blog dbi services.
