2017年11月27日 星期一

Active DataGuard 出現 MRP0: Background Media Recovery terminated with error 1111 錯誤


錯誤訊息
一開始出現

MRP0: Background Media Recovery terminated with error 1274
Errors in file /ORA/db/diag/rdbms/lg_st/LG/trace/LG_pr00_2553.trc:
ORA-01274: cannot add datafile '+DATA_TAB03/lg/datafile/LG_TAB04.dbf' - file could not be created
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 24982363193
Tue Jun 12 17:40:06 2018

MRP0: Background Media Recovery process shutdown (LG)

後來出現

Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
MRP0: Background Media Recovery terminated with error 1111
Errors in file /ORA/11.2.4/diag/rdbms/ora28/LG/trace/LG_mrp0_19769.trc:
ORA-01111: name for data file 10 is unknown - rename to correct file
ORA-01110: data file 10: '/ORA/11.2.4/db/dbs/UNNAMED00010'
ORA-01157: cannot identify/lock data file 10 - see DBWR trace file
ORA-01111: name for data file 10 is unknown - rename to correct file
ORA-01110: data file 10: '/ORA/11.2.4/db/dbs/UNNAMED00010'
Managed Standby Recovery not using Real Time Apply
MRP0: Background Media Recovery process shutdown (LG)
Tue Nov 28 14:47:33 2017


表示 Active DataGuard  這邊的 db_file_name_convert 參數沒有設定好


----

以下是解決方式 , 可線上處理

1.先把 db_file_name_convert  設定好 

alter system set db_file_name_convert='+ORA/datafile','/ORA/file1' scope=spfile;
2.將備庫的standby_file_management參數修改為手動管理模式
 alter system set standby_file_management='MANUAL' scope=both;

3.把UNNAMED數據文件放到正確路徑
alter database create datafile '/ORA/11.2.4/db/dbs/UNNAMED00010' as '/ORA/file2/LG_TAB04.dbf';
  

4.將備庫的standby_file_management參數修改回自動模式
ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO scope=both;

5.備庫啟動同步

要是線上處理直接執行 就可以
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;



---

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

ALTER DATABASE OPEN READ ONLY;

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;



6.完成



以上的方法比較快, 要是可以的話也可以重做 DATA GUARD


2017年9月10日 星期日

Redis cluster 不斷 crash







在有一個 Redis cluster 測試環境中, 發現cluster 狀態不太穩定, 看了一下Log發現大量出現以下訊息


5009:S 11 Sep 00:09:00.086 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
5009:S 11 Sep 00:11:11.086 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
5009:S 11 Sep 00:16:05.098 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
5009:S 11 Sep 00:16:30.030 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.
5009:S 11 Sep 00:21:37.687 # Cluster state changed: fail
5009:S 11 Sep 00:21:38.772 # Cluster state changed: ok
5009:S 11 Sep 00:21:38.810 * FAIL message received from 903ab5e1fb428082550d14de7b1bf97830bf8ad9 about 00b03db3d55d9c9045023d5e620c6f5d68ccab01
5009:S 11 Sep 00:21:41.368 # Start of election delayed for 754 milliseconds (rank #0, offset 33496646).
5009:S 11 Sep 00:21:41.368 # Cluster state changed: fail
5009:S 11 Sep 00:21:42.169 # Starting a failover election for epoch 72.
5009:S 11 Sep 00:21:42.248 # Failover election won: I'm the new master.
5009:S 11 Sep 00:21:42.248 # configEpoch set to 72 after successful failover
5009:M 11 Sep 00:21:42.248 # Connection with master lost.
5009:M 11 Sep 00:21:42.248 * Caching the disconnected master state.
5009:M 11 Sep 00:21:42.248 * Discarding previously cached master state.
5009:M 11 Sep 00:21:42.845 * Slave 10.22.21.186:8101 asks for synchronization
5009:M 11 Sep 00:21:42.845 * Full resync requested by slave 10.22.21.186:8101
5009:M 11 Sep 00:21:42.845 * Starting BGSAVE for SYNC with target: disk
5009:M 11 Sep 00:21:42.846 * Background saving started by pid 25548



一直出現

5009:S 11 Sep 00:16:30.030 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.

錯誤後 Redis 就開始進行節點切換 Master - Slave 


原因: 
1.慢查詢   -- 這邊看了一下 慢查詢最慢也才  0.064 秒 , 不是這個問題
2.aof 檔案太大  -- bgrewriteaof 減少 size 大小, 但過10分後就又出現錯誤訊息 ,  雖然訊息出現的頻率已降低
3.cpu 忙碌 -- 此主機當下 Idle還在 98% 所以排除這個問題
4.硬碟太慢 -- 此Redis 是安裝在VM上,  Redis cluster instance 都在同一台主機上
可以調整 aof sync 頻率

參數: appendfsync
總共有三個值可以輸入 

always
everysec
no

官方解釋為

always :將 aof_buf 緩衝區中的所有內容寫入並同步到 AOF 文件。
everysec :將aof_buf 緩衝區中的所有內容寫入到AOF 文件, 如果上次同步AOF 文件的時間距離現在超過一秒鐘, 那麼再次對AOF 文件進行同步, 並且這個同步操作是由一個線程專門負責執行的。
no :將 aof_buf 緩衝區中的所有內容寫入到 AOF 文件, 但並不對 AOF 文件進行同步, 何時同步由操作系統來決定。


要是資料安全性不用太高的話, 可以嘗試把此值設為 no




以下是 官方對於 Redis latency 的 troubleshooting與看法

https://redis.io/topics/latency








2017年7月11日 星期二

Oracle ADG+OGG 配置


在一般環境中很常看到 RAC+ Active Data Guard 或者 RAC +  Golden Gate 的配置

以下介紹 ADG + OGG 配置, 不從主庫這次從備庫同步到OGG的DB

*.RAC + ADG + OGG 跟這次所說的配置其實大同小異 , 這一篇看完  RAC + ADG + OGG 應該也就會建立了


環境說明

10.10.17.27 主庫
10.10.17.28 備庫(Active Data Guard)
10.10.17.30 備庫(Golden Gate)


安裝 Active Data Guard 就不再說明,這邊就直接配置 Active Data Guard  (10.10.17.28)+ Golden gate (10.10.17.30) 同步操作



10.10.17.27 28 tnsnames.ora 內容


[oracle@ora28 admin]$ cat tnsnames.ora 
# tnsnames.ora Network Configuration File: /ORA/11.2.4/db/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.


ora27 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.17.27)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = orcl)
    )
  )

ora28 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.17.28)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = orcl)
    )
  )




1.(10.10.17.28) ADG 配置 mgr

設定 mgr
GGSCI (ora28) 1> edit param mgr

PORT 7809
dynamicportlist 7501-7505

autorestart extract *,waitminutes 2,retries 5

啟動 mgr
GGSCI (ora28) 3> start mgr

Manager started.


2.(10.10.17.28) ADG 配置 Extract

GGSCI (ora28) 4> dblogin userid ogg,password ogg
Successfully logged into database.

GGSCI (ora28 as ogg@LG) 5> 

GGSCI (ora28 as ogg@LG) 5> add extract ext1, tranlog,begin now
EXTRACT added.


GGSCI (ora28 as ogg@LG) 6> edit params ext1


#因為 28 是備庫 read only , 所以這邊這邊登入27 主庫,且只讀取 archivedlog
所以說必須等到主庫switch logfile , extract 才會擷取資料丟過去 30

extract ext1
userid ogg@ora27,password ogg
tranlogoptions archivedlogonly
tranlogoptions altarchivelogdest /ORA/ARCH
exttrail /ORA/ogg/dirdat/ex
discardfile /ORA/ogg/dirrpt/exta.dsc,append, megabytes 500

table VJ.*;


GGSCI (ora28 as ogg@LG) 7>  ADD EXTTRAIL /ORA/ogg/dirdat/ex, EXTRACT ext1                                                                      
EXTTRAIL added.


GGSCI (ora28 as ogg@LG) 8> 



3.(10.10.17.28) ADG 配置 第二個 extract (data dump)

GGSCI (ora28 as ogg@LG) 8> edit params pump1        


EXTRACT dump1
RMTHOST 10.10.17.30, MGRPORT 7809 TCPBUFSIZE 5000000
PASSTHRU
RMTTRAIL /ORA/ogg/dirdat/pt
NUMFILES 3000
TABLE VJ.*;

*rmthost :為目的端的IP及Port
*rmttrail :為目的端的接收資料檔的位置

#.這邊配置 要從哪裡取資料
GGSCI (ora28 as ogg@LG) 9> add extract  dump1, exttrailsource  /ORA/ogg/dirdat/ex
EXTRACT added.

GGSCI (ora28 as ogg@LG) 10> 

#.這邊配置要把檔案傳去Target 哪一個位置
GGSCI (ora28 as ogg@LG) 10> add rmttrail /ORA/ogg/dirdat/pt, extract dump1, megabytes 10

RMTTRAIL added.



啟動 dump 跟 ext

GGSCI (ora28 as ogg@LG) 17> start dump1

Sending START request to MANAGER ...

EXTRACT DUMP1 starting


GGSCI (ora28 as ogg@LG) 24> start ext1

Sending START request to MANAGER ...
EXTRACT EXT1 starting


GGSCI (ora28 as ogg@LG) 25> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DUMP1       00:00:00      00:00:08    
EXTRACT     RUNNING     EXT1        01:52:30      00:00:01    



GGSCI (ora28 as ogg@LG) 26> 



4.(10.10.17.30) OGG 配置 checkpoing table 

GGSCI (datagu) 1> dblogin userid ogg,password ogg
Successfully logged into database.

GGSCI (datagu as ogg_tar@orclogg) 2> add checkpointtable ogg.ogg_chk

 Successfully created checkpoint table ogg.ogg_chk

將 checkpoint table 設定在 golden gate 裡頭

edit params ./GLOBALS

GGSCHEMA OGG
CHECKPOINTTABLE OGG.OGG_CHK



5.(10.10.17.30) OGG 配置 Peplicat

新增Peplicat
GGSCI (ora30 as ogg@LG) 2> add replicat rep01,exttrail /ORA/ogg/dirdat/pt,checkpointtable OGG.OGG_CHK
REPLICAT added.

*.這邊設定要從哪裡讀取從target丟過來的檔案

設定參數

GGSCI (ora30 as ogg@LG) 3> edit params rep01


replicat rep01
setenv (ORACLE_SID=orcl)
userid ogg,password ogg
assumetargetdefs
reperror default,discard
discardfile /ORA/ogg/dirdat/repl.dsc,append,megabytes 50
dynamicresolution
DDL INCLUDE ALL
DDLERROR DEFAULT IGNORE RETRYOP MAXRETRIES 3 RETRYDELAY 5
map VJ.*, target VJ.*;


6.啟動 (10.10.17.30) Peplicat                              

GGSCI (ora30 as ogg@LG) 4> start rep01

Sending START request to MANAGER ...
REPLICAT REP01 starting


GGSCI (ora30 as ogg@LG) 5> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     REP01       00:00:00      00:00:01    


GGSCI (ora30 as ogg@LG) 6> 


看一下有無錯誤

GGSCI (ora30 as ogg@LG) 7> view report rep01

.
.
.
.
.


Database Language and Character Set:
NLS_LANGUAGE     = "AMERICAN" 
NLS_TERRITORY    = "AMERICA" 
NLS_CHARACTERSET = "AL32UTF8" 

***********************************************************************
**                     Run Time Messages                             **
***********************************************************************


2017-07-12 13:45:53  INFO    OGG-02243  Opened trail file /ORA/ogg/dirdat/pt000000000 at 2017-07-12 13:45:53.977593.


看來有正常讀取到



7.主庫27進行操作 並且執行 alter system switch logfile; 再去看看備庫30有無資料










-----------

在RAC 環境時 ext1 還需要設定

TRANLOGOPTIONS altarchivelogdest primary instance orcl1 /u02/Archive/ORCL/, altarchivelogdest instance orcl2 /u03/Archive/ORCL/

來配置 rac1 及 rac2 的位置





2017年7月7日 星期五

Redis Cluster 到底可以掛幾台node 服務才不會掛?


這次實驗是使用 Redis Cluster 最基本的架構 3Master + 3Slave


這邊可以看到每一個Slave 後面掛的是它最主要同步的Master 的 id 


要是Master 掛掉會由它對應的Slave 會去接手變成Master , 那從Slave 變成Master 它本身也會從只讀 變成 可讀寫的Master


1. 原先的三台Master 全掛

可以看到Master 掛掉後 都被相對應的Slave 接手, 寫入資料沒問題

這時再掛掉一台Master , Cluster 服務也會隨即被關閉


2. 掛掉一台Master, 在掛掉一台沒有Slave 的Master

這邊可以看到 f710feca8f66b5fa64a98be8bf57d3e1ef372264 從Slave變成Master 但這台並沒有Slave 去對應


關掉 f710feca8f66b5fa64a98be8bf57d3e1ef372264 , 這時 Cluster 服務就會關閉


3. Slave 全關閉


Slave全關後, Cluster 服務還存在, 但這時候隨便掛掉一台 Master 服務就會停掉囉






2017年7月3日 星期一

Redis 新增 node (Master,Slave)



在原本Redis 三主三備 Cluster架構中 新增一組 Master Slave


1.啟動 新的 Redis

/redis/software/redis-3.2.9/src/redis-server /redis/software/8000/redis.conf
/redis/software/redis-3.2.9/src/redis-server /redis/software/8001/redis.conf

[root@localhost 8000]# ps -ef|grep redis
root      9360     1  0 12:17 ?        00:00:03 /redis/software/redis-3.2.9/src/redis-server *:8000 [cluster]
root      9366     1  0 12:17 ?        00:00:01 /redis/software/redis-3.2.9/src/redis-server *:8001 [cluster]


2.把新的Redis node 加進去 Redis Cluster 中 
#前面是新的node , 後面是原有的node

redis-trib.rb add-node 10.22.22.174:8000 10.22.22.174:7000
redis-trib.rb add-node 10.22.22.174:8001 10.22.22.174:7000



成功畫面, 要是出現 [ERR] Node 10.22.22.174:8001 is not empty. Either the node already knows other nodes (check with CLUSTER NODES) or contains some key in database 0. 錯誤

請先刪除原有的 node.log 檔案, 再次加入就可以

加入完成會出現 兩個沒有分配 Slot 的Master 的node

以下是 8000及8001 Port 



3.分配Redis Port 8000 3000個 Slot

redis-trib.rb reshard 10.22.22.174:8000

[root@localhost 8001]# redis-trib.rb reshard 10.22.22.174:8000
>>> Performing Cluster Check (using node 10.22.22.174:8000)
M: 086e40ce9139e625cea92da35619094e2c4a059f 10.22.22.174:8000
   slots: (0 slots) master
   0 additional replica(s)
S: 661f01b51c3aa508b9a990d123e24e6e1378ff61 10.22.22.174:7001
   slots: (0 slots) slave
   replicates f4eb29d1f76b0a29bec4c529377277b6e86a326e
M: 0d09b3e355be6b528c902af6a217a369fc4f4ef1 10.22.22.174:8001
   slots: (0 slots) master
   0 additional replica(s)
M: 493d6e53aa3e263812eb72bbd858784d4d9e88ba 10.22.22.167:7000
   slots:999-5460 (4462 slots) master
   1 additional replica(s)
S: 221ffb33e53faabc8805d79ed2cc39ca720d3698 10.22.22.186:7001
   slots: (0 slots) slave
   replicates 493d6e53aa3e263812eb72bbd858784d4d9e88ba
M: f4eb29d1f76b0a29bec4c529377277b6e86a326e 10.22.22.174:7000
   slots:0-998,5461-6461,10923-16383 (7461 slots) master
   1 additional replica(s)
S: 9b01d87f22fa6131adf8e3f136f4d63680e63892 10.22.22.167:7001
   slots: (0 slots) slave
   replicates af0061956f918d9d1cdb512465e26e3f8600b5c7
M: af0061956f918d9d1cdb512465e26e3f8600b5c7 10.22.22.186:7000
   slots:6462-10922 (4461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
### 這邊輸入 要分配的 Slot 的數, 這邊分配 3000個
How many slots do you want to move (from 1 to 16384)? 3000
### 這邊輸入哪一個要接收這3000個的node
What is the receiving node ID? 086e40ce9139e625cea92da35619094e2c4a059f
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
##這邊輸入 all 代表 其他節點要自己分配掏出3000個, 也可以從單個節點索取
Source node #1:all




輸入完 Yes , 它就會自動分配了


查看一下 目前Cluster 狀態, 可以看到Port  8000 已經被分配了 Slot




4.配置 Slave

為剛剛的Port 8000 配置一個 Slave

直接登入 8001 輸入cluster replicate  Master id
redis-cli -p 8001 cluster replicate 086e40ce9139e625cea92da35619094e2c4a059f



這邊就可以看到 8001已經是 8000的Slave 了

4個Master 4個Slave


---------------
刪除 Master node

1.先把該Master node 上的 Slot 移去別的Slot

redis-trib.rb reshard 10.22.22.174:8000

[root@localhost 8000]# redis-trib.rb reshard 10.22.22.174:8000
>>> Performing Cluster Check (using node 10.22.22.174:8000)
M: 086e40ce9139e625cea92da35619094e2c4a059f 10.22.22.174:8000
   slots:0-998,5461-6461,10923-11921 (2999 slots) master
   1 additional replica(s)
S: 661f01b51c3aa508b9a990d123e24e6e1378ff61 10.22.22.174:7001
   slots: (0 slots) slave
   replicates f4eb29d1f76b0a29bec4c529377277b6e86a326e
M: 493d6e53aa3e263812eb72bbd858784d4d9e88ba 10.22.22.167:7000
   slots:999-5460 (4462 slots) master
   1 additional replica(s)
S: 221ffb33e53faabc8805d79ed2cc39ca720d3698 10.22.22.186:7001
   slots: (0 slots) slave
   replicates 493d6e53aa3e263812eb72bbd858784d4d9e88ba
M: f4eb29d1f76b0a29bec4c529377277b6e86a326e 10.22.22.174:7000
   slots:11922-16383 (4462 slots) master
   1 additional replica(s)
S: 3c09e774bee37e3c4843b791f3a51bb79bd867a5 10.22.22.174:8001
   slots: (0 slots) slave
   replicates 086e40ce9139e625cea92da35619094e2c4a059f
S: 9b01d87f22fa6131adf8e3f136f4d63680e63892 10.22.22.167:7001
   slots: (0 slots) slave
   replicates af0061956f918d9d1cdb512465e26e3f8600b5c7
M: af0061956f918d9d1cdb512465e26e3f8600b5c7 10.22.22.186:7000
   slots:6462-10922 (4461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
### 這邊輸入該Master 擁有的Slot 數
How many slots do you want to move (from 1 to 16384)? 3000
### 這邊輸入誰要接收這3000個Slot
What is the receiving node ID? f4eb29d1f76b0a29bec4c529377277b6e86a326e
Please enter all the source node IDs.
  Type 'all' to use all the nodes as source nodes for the hash slots.
  Type 'done' once you entered all the source nodes IDs.
###這Slot來源是哪一個Master 這邊輸入要移除的Master id
Source node #1:086e40ce9139e625cea92da35619094e2c4a059f
Source node #2:done

2.移除節點

redis-trib.rb del-node 10.22.22.174:8000 '086e40ce9139e625cea92da35619094e2c4a059f'  

3.移除Salve

redis-trib.rb del-node 10.22.22.174:8001 '3c09e774bee37e3c4843b791f3a51bb79bd867a5'  

>>> Removing node 3c09e774bee37e3c4843b791f3a51bb79bd867a5 from cluster 10.22.22.174:8001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.



2017年6月20日 星期二

Postgres 9.2 升級 9.6 使用pg_upgrade工具


Postgres 升級有數個方案可以使用

1.使用 pg_dump 跟pg_restore 做匯出匯入,但資料量大的話匯入時間就會拉長
2.使用 pg_upgrade 
3.使用同步 舊版本向新版本資料同步 , 但我這邊 9.2 - 9.6 版本差異太大,會出現以下錯誤

pg_basebackup: incompatible server version 9.2.15; client does not support streaming from server versions older than 9.3

所以我這邊使用 pg_upgrade 升級

Postgres 升級 9.2 > 9.6 

使用 pg_upgrade 指令



環境說明

9.2
PGDATA : /datafile/9.2
bin : /usr/pgsql-9.2/bin


9.6
PGDATA : /datafile/9.6
bin : /usr/pgsql-9.6/bin



1.先安裝 Postgres 9.6 

使用initdb 建立一個新的 9.6 資料庫

-bash-4.2$ /usr/pgsql-9.6/bin/initdb -D /datafile/9.6

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "zh_TW.UTF-8".
The default database encoding has accordingly been set to "UTF8".
initdb: could not find suitable text search configuration for locale "zh_TW.UTF-8"
The default text search configuration will be set to "simple".

Data page checksums are disabled.

fixing permissions on existing directory /datafile/9.6 ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /usr/pgsql-9.6/bin/pg_ctl -D /datafile/9.6 -l logfile start



2.停止 9.2 資料庫

-bash-4.2$ pg_ctl stop -m i
正在等候伺服器關閉.... 完成
伺服器已停止
-bash-4.2$ 
-bash-4.2$ 



3.使用 pg_upgrade 指令升級



/usr/pgsql-9.6/bin/pg_upgrade -d /datafile/9.2 -D /datafile/9.6 -b /usr/pgsql-9.2/bin/ -B /usr/pgsql-9.6/bin/
-d 指定舊版本的資料夾 -D 指定新版本資料夾


-bash-4.2$ 

-bash-4.2$ /usr/pgsql-9.6/bin/pg_upgrade -d /datafile/9.2 -D /datafile/9.6 -b /usr/pgsql-9.2/bin/ -B /usr/pgsql-9.6/bin/

Performing Consistency Checks

-----------------------------

Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for roles starting with 'pg_'                      ok
Checking for invalid "line" user columns                    ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
------------------
Analyzing all rows in the new cluster                       ok
Freezing all rows on the new cluster                        ok
Deleting files from new pg_clog                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Setting oldest multixact ID on new cluster                  ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
                                                            ok
Setting minmxid counter in new cluster                      ok
Copying user relation files
                                                            ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok

Upgrade Complete
----------------
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
-bash-4.2$ 

完成!!!!!


可以看到最後的訊息, Postgres 正在檢查及複製資料檔案



4.啟動 9.6 版本資料庫


-bash-4.2$ /usr/pgsql-9.6/bin/pg_ctl -D /datafile/9.6 -l logfile start

server starting

-bash-4.2$ 

-bash-4.2$ 

-bash-4.2$ ps -ef|grep postgres
root      3048  3009  0 15:05 pts/1    00:00:00 su - postgres
postgres  3049  3048  0 15:05 pts/1    00:00:00 -bash
root      4301  2666  0 15:56 pts/0    00:00:00 su - postgres
postgres  4302  4301  0 15:56 pts/0    00:00:00 -bash
postgres  4708     1  0 16:09 pts/0    00:00:00 /usr/pgsql-9.6/bin/postgres -D /datafile/9.6
postgres  4709  4708  0 16:09 ?        00:00:00 postgres: logger process   
postgres  4711  4708  0 16:09 ?        00:00:00 postgres: checkpointer process   
postgres  4712  4708  0 16:09 ?        00:00:00 postgres: writer process   
postgres  4713  4708  0 16:09 ?        00:00:00 postgres: wal writer process   
postgres  4714  4708  0 16:09 ?        00:00:00 postgres: autovacuum launcher process   
postgres  4715  4708  0 16:09 ?        00:00:00 postgres: stats collector process   
postgres  4716  4302  0 16:09 pts/0    00:00:00 ps -ef
postgres  4717  4302  0 16:09 pts/0    00:00:00 grep --color=auto postgres
-bash-4.2$ 
-bash-4.2$ /usr/pgsql-9.6/bin/psql
psql (9.6.3)
輸入 "help" 顯示說明。

postgres=# 
postgres=# \l
                                     資料庫清單
   名稱    |  擁有者  | 字元編碼 |   Collate   |  轉換型別   |       存取權限        
-----------+----------+----------+-------------+-------------+-----------------------
 TEST       | TEST      | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | 
 postgres  | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | 
 template0 | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | postgres=CTc/postgres+
           |          |          |             |             | =c/postgres
(4 筆資料列)

postgres=# 
postgres=# 



5.配置 9.6 postgres.conf、pg_hba.conf 文件及修改.bash_profile




6.重新收集統計資訊


這時9.6 資料夾下有一個 analyze_new_cluster.sh , 是postgres 自動生成的檔案


-bash-4.2$ cd /datafile/9.6/

-bash-4.2$ 

-bash-4.2$ ls -al

總計 144

drwx------. 20 postgres postgres  4096  6月 20 16:14 .
drwxr-xr-x   5 root     root      4096  6月 20 15:22 ..
-rwx------   1 postgres postgres   755  6月 20 16:07 analyze_new_cluster.sh
drwx------   6 postgres postgres  4096  6月 20 16:03 base
-rwx------   1 postgres postgres    34  6月 20 16:07 delete_old_cluster.sh
drwx------   2 postgres postgres  4096  6月 20 16:15 global
-rw-------   1 postgres postgres   364  6月 20 16:14 logfile
drwx------   2 postgres postgres  4096  6月 20 16:03 pg_clog
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_commit_ts
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_dynshmem
-rw-------   1 postgres postgres  4502  6月 20 15:55 pg_hba.conf
-rw-------   1 postgres postgres  1636  6月 20 15:54 pg_ident.conf
drwx------   2 postgres postgres  4096  6月 20 16:03 pg_log
drwx------   4 postgres postgres  4096  6月 20 15:54 pg_logical
drwx------   4 postgres postgres  4096  6月 20 15:54 pg_multixact
drwx------   2 postgres postgres  4096  6月 20 16:14 pg_notify
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_replslot
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_serial
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_snapshots
drwx------   2 postgres postgres  4096  6月 20 16:14 pg_stat
drwx------   2 postgres postgres  4096  6月 20 16:16 pg_stat_tmp
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_subtrans
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_tblspc
drwx------   2 postgres postgres  4096  6月 20 15:54 pg_twophase
-rw-------   1 postgres postgres     4  6月 20 15:54 PG_VERSION
drwx------   3 postgres postgres  4096  6月 20 16:07 pg_xlog
-rw-------   1 postgres postgres    88  6月 20 15:54 postgresql.auto.conf
-rw-------   1 postgres postgres 22233  6月 20 16:11 postgresql.conf
-rw-------   1 postgres postgres    49  6月 20 16:14 postmaster.opts
-rw-------   1 postgres postgres    77  6月 20 16:14 postmaster.pid
-bash-4.2$ 
-bash-4.2$ ./analyze_new_cluster.sh 

-bash-4.2$ ./analyze_new_cluster.sh 

This script will generate minimal optimizer statistics rapidly

so your system is usable, and then gather statistics twice more

with increasing accuracy.  When it is done, your system will

have the default level of optimizer statistics.

If you have used ALTER TABLE to modify the statistics target for
any tables, you might want to remove them and restore them after
running this script because they will delay fast statistics generation.

If you would like default statistics as quickly as possible, cancel
this script and run:
    "/usr/pgsql-9.6/bin/vacuumdb" --all --analyze-only

vacuumdb: processing database "TEST": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target)
vacuumdb: processing database "TEST": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets)
vacuumdb: processing database "TEST": Generating default (full) optimizer statistics
vacuumdb: processing database "postgres": Generating default (full) optimizer statistics
vacuumdb: processing database "template1": Generating default (full) optimizer statistics

Done
-bash-4.2$ 
-bash-4.2$ 

其實它也就只有跑
/usr/pgsql-9.6/bin/vacuumdb" --all --analyze-only 
這個指令而已


7.刪除舊版本軟體及資料

執行 delete_old_cluster.sh , 這個也是剛剛執行 pg_upgrade 後產生的檔案




[root@localhost 9.6]# ./delete_old_cluster.sh 

[root@localhost 9.6]# 

[root@localhost 9.6]# 




查看一下 delete_old_cluster.sh  其實就是 rm 指令而已

-bash-4.2$ cat delete_old_cluster.sh 

#!/bin/sh



rm -rf '/datafile/9.2'

-bash-4.2$ 
-bash-4.2$ 


8.確認

-bash-4.2$ 
-bash-4.2$ 
-bash-4.2$ psql
psql (9.6.3)
輸入 "help" 顯示說明。

postgres=# 
postgres=# 
postgres=# \l
                                     資料庫清單
   名稱    |  擁有者  | 字元編碼 |   Collate   |  轉換型別   |       存取權限        
-----------+----------+----------+-------------+-------------+-----------------------
 TEST       | TEST      | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | 
 postgres  | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | 
 template0 | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | zh_TW.UTF-8 | zh_TW.UTF-8 | postgres=CTc/postgres+
           |          |          |             |             | =c/postgres
(4 筆資料列)

postgres=# 
postgres=# 


9.6.3 確認無誤, 升級完成!!!!

(7.6G資料我大概15分內轉換完成)