我参与编辑的图书:《Oracle数据库性能优化》出版了。排版上我感觉不是很好。因为这个出版社喜欢用Word排版(这样降低了成本),但是从阅读的角度上来看,就不够太贴心了。
样书我只拿到了一本,在公司放着,不知道被谁拿走了。
--EOF--我参与编辑的图书:《Oracle数据库性能优化》出版了。排版上我感觉不是很好。因为这个出版社喜欢用Word排版(这样降低了成本),但是从阅读的角度上来看,就不够太贴心了。
样书我只拿到了一本,在公司放着,不知道被谁拿走了。
--EOF--前几天发生的事情.系统是 9iR2 的 DataGuard . 注意到 Data Guard 主库上有的时候会在$BDUMP 目录下产生包含如下内容的trace 文件:
......
Destination LOG_ARCHIVE_DEST_2 is in CLUSTER CONSISTENT mode
Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode
Destination LOG_ARCHIVE_DEST_2 is in CLUSTER CONSISTENT mode
Destination LOG_ARCHIVE_DEST_2 is in MAXIMUM PERFORMANCE mode
Destination LOG_ARCHIVE_DEST_2 is in CLUSTER CONSISTENT mode
......
以前遇到过一次,当时忽略掉了.不过今天忽然感觉之所以产生这样的Log还是比 较奇怪的. 找了半天 Metalink ,发现还真有对这个现象的解释:
因为某些原因,国外的很多 Blog 站点在国内都是看不到的,包括 Blogspot.BlogSpot 上有很多不错的 Oracle 专家的 Blog .比如 Thomas Kyte. 此外还有 Anjo Kolk 的Oraperf. Oraperf 的三位大牛Anjo Kolk、Shari Yamaguchi、Jim Viscusi曾经在10年前提出 Oracle 数据库优化的 YAPP(Yet Another Performance Profiling Method)方法。在这个Blog第一篇就是回顾YAPP发布后的十年(YAPP Ten years later: What has changed? ).以下内容为引用:
In 1995 I started to work on the wait events paper based on Oracle 7.3, and that formed the basis for the YAPP white paper. That in turn formed the basis for the wave of response time tuning books and presentations. Now it is funny to see how the same thing is rehashed over and over again and nothing really new is being added. In fact I believe that all the attention on wait events and response time or resource tuning in the Oracle RDBMS, is taking away the way the focus of performance problems that actually originate outside the database. The word "over exposure" comes to mind. Does this mean that wait events are no longer important? Well yes and no. Believe it or not, but most databases suffer from the same performance problems. They differ in the symptoms that they show. For example, many databases suffer from I/O performance problems and Oracle has quite a number of wait events that are directly and indirectly associated with I/O. So instead of approaching each of these events individualy, they could be grouped together to just show the symptom. In fact, Oracle 10g has finally done this and has introduced wait event classes (not new, a company Precise (now part of Veritas) did that first in their product back in 1998). So Oracle is still expanding the number of wait events, but at the same it is grouping them together again.
Before the response time tuning (and even today) people actually based on best practices (BP). Each best practice has a ratio associated with it. For example the Buffer Cache Hit Ratio, basically is the Best Practice that tells people to cache frequently used data (which is a good thing in principle). The problem with tuning today is that many Best Practices exists and they all have some kind of ratio associated with them. So if a performance problem occurs DBAs starting working on this list of Best Practices to check if something applies to their problem. There are a couple of problems with that:
1) Not every body may have the same list of Best Practices or the same threshold for the ratios.
2) The list of Best Practices may not be sorted in the same way for different people
The result is that the problem finding process takes a long time and is not really repeatable by different people (different lists, different ratios). So starting the problem finding process with Best Practices is like shooting a gun and hooping that you will hit some thing.
The Response Time tuning process is basically telling you what Best Practice to use. For example if the Response Time consists mostly of I/O relatated waits we should start looking at the Best Practices for I/O. If CPU is the most common resource consumption, we should start looking at the Logical I/Os .
In this approach the different wait event groups play an important role, because each group is basically assoicated with one or more Best Practices. We still do care about what wait events are actually in the group, but for selecting the right Best Practice it doesn't matter. For solving the problem, it may be important.
I believe that people should start thinking this way instead of worrying about the individual events. I am not saying that the individual events are not important, but keep an eye on the complete picture before diving into the indivual events.
So one of these days (may this year) I should write an update to the YAPP paper and hopefully it can be basisses for a wave in the response time tuning. Oh yeah, I don't see my self as the inventor of all this. As far as I am concerned, Response time Tuning has always been there, it was just not really accepted in the Oracle world as the way of tuning your Oracle database. So may be I started that, but I just wrote the paper(s) and other people like Mogens Norgaard made it popular. It is kind of funny that I actually still use the same method(s) almost 10 years later (and it doesn't matter if they are on instance, session or SQL statement level it is the same methodology, with different result but still solving the same problems; think about that one .....)
提到了所谓的 BP(best practices) 方法.想到曾经在一个站点上看到所谓的 HP 专家最欣赏"最佳实践"之类的说法不由得有点好笑.
还在 Oraperf 上看到了一则关于 Outer Edge of Disk 的帖子。 Oraperf's Blog: Outer Edge of Disk
The question is why one should place data on the outside of the disk and if that helps performance. I just like to add my view on that discussion. The number of physical I/Os per Disk is limited by the (full/average) seek time. The seek time is the biggest component of the I/O time (with normal Oracle blocksize). The Full Seek Time (seeking from the outermost track to the innermost track or vice versa) can be greatly reduced by only using the outermost tracks on a disk. Why the outermost tracks? Well there is this statistic that on the 1/3 of the outermost tracks there is around 50 percent of the disk capacity stored. If I only have to seek 1/3 of the distance to reach 50 percent of the data all the time, the average seek time will be greatly reduced. If the Seek time is greatly reduced, the total I/O time is greatly reduced and we can do more random I/O operations per second. So to summarize: 1) Use around 50 percent of the disk capacity (outer 1/3 of the tracks) 2) That will increase the number of random I/Os greatly.
从Mark Rittman 的 Blog 上看到,IBM 与 Hyperion "分手"了. 而以前,IBM 把 DB2 OLAP Server (OEM 版本的 Hyperion Essbase)作为自己 BI 解决方案中的重要一环.这下子估计有不少中国 Essbase 用户要面临迁移的问题了.
前不久,IBM 刚刚 收购 Ascential,当时我还说:
不过我倒是觉得 IBM 更应该收购的公司是 Hyperion ,把 Hyperion Essbase 变成真正的 IBM 产品才有意思一些
阅读AIX的手册,看到关于 Direct I/O 的一些描述:
直接 I/O 与正常高速缓存的 I/O
通常,JFS 或 JFS2 将文件页面高速缓存在内核存储器中。当应用程序执行文件读取请求时,如果文件页面不在内存中,则 JFS 或 JFS2 将数据从磁盘读取到文件高速缓存,然后将数据从文件高速缓存复制到用户缓冲区。对于应用程序的写操作,仅将数据从用户的缓冲区复制到高速缓存。以后执行对磁盘的实际写入。
当高速缓存的使用率比较高时,这种高速缓存策略可以发挥完全的效用。它还启用从头读取、从后写入的策略。最后,它使文件写操作异步,允许应用程序继续处理,而不是等待 I/O 请求完成。
直接 I/O 是一种可选的高速缓存策略,它使得文件数据直接在磁盘和用户缓冲区之间传送。文件的直接 I/O 在功能上等同于设备的原始 I/O。应用程序可以在 JFS 或 JFS2 上使用直接 I/O。
几天前发生的一件事情:因为一个小失误,折腾了一个下午.
事情开始是这样的:收到邮件被告知要 HDS 9970 Web 控制台加个 Monitor 的用户.想到缺省密码还没有修改.嗯,有安全问题,须改之.系统管理员要我用 Remote Administrator 这个垃圾工具登陆.不爽得很.这软件一个人登陆,其他登陆的人也能看到他的操作,就和远程会议一样.我修改密码,我拷贝,我粘贴,哎呀,给我报告了一个 http 500 的错误.尝试重新登陆,上不去了!用旧密码,不行,试着各种可能,不行,晕倒
问题都出在这个 Copy + Paste 上,但是现在后悔也晚了,当时真是想这不是没事找抽么......
打电话,联系厂家的支持人员,他们也没么好办法,到了机房,说要重新安装微码的软件盘,我找,我找,我找找找--找不到.
抱着试试看的态度在系统里找密码可能的存储位置,嘿,别说,还真的..........................................找到了
又折腾了一会儿,总算是恢复了
今天听 10g 的 Podcast ,提及 10gR2 对 PHP 提供进一步的支持.只是细节没听清楚--还需要锻炼听力.10gR1 对 php 已经提供了不错的支持.10gR2 居然还有增强,可见 Oracle 对 OpenSource 还是很有信心.
翻看 PDF 手册,看到了如下比较感兴趣的内容(10gR2的文档结构和以前有了很大变化):
控制文件的特性有了一些增强.在修改 MAXLOGFILES、 MAXLOGMEMBERS、 MAXLOGHISTORY和MAXINSTANCES这几个参数的时候,已经不用重新创建了。可用性有了一定提高。
表空间监控 表空间达到定义的临界值的时候可以自动告警。当然,这个是和 EM 结合在一起的.
Online segment shrink 功能进一步改进.现在可以支持Lob对象和IOT segment.
install 10g R2, it has been sitting on my desktop for two days now…Tom 透漏的一个重要的消息是他的 Expert One-on-one 第二版将会首先以 CD 的形式发布.目前Guru Tom 的写作进度:
- finish chapter 12 (new chapter that didn’t exist before on datatypes – who would have thought there is so much to say about datatypes..)
- edit chapter 6 (well, that’ll be this weekend)…
- do chapters 13, 14, 15 (that’ll be the end of volume I, new stuff for data unload/load, new stuff for parallel chapter that never exists and update partitioning to 2005)